Method for producing a nucleotide sequence construct with optimised codons for an HIV
genetic vaccine based on a primary, early HIV isolate and synthetic envelope BX08 
constructs. 

Field of the invention

The invention relates to a DNA vaccine against HIV, which is designed from a clinical 
primary isolate. One aspect of the invention relates to a method of producing a nucleotide 
sequence construct, in a preferred aspect based on a cassette system, the nucleotide
sequence construct being used as a DNA vaccine. The method can, for example, lead to the
disclosed synthetic BX08 HIV-1 envelope vaccine nucleotide sequence construct, designed
to generate suitable DNA vaccines against HIV, specifically HIV-1. Furthermore, the
invention can be used for the production of recombinant protein antigens. 

Background of the invention 

There is an urgent need for new vaccine strategies against HIV. One such promising new
strategy is called genetic immunisation or DNA vaccination (Webster et al 1997). Some of the
advantages of a DNA vaccine against HIV are the induction of Th cell activation, the induction of
antibodies, also against conformation-dependent epitopes, and the induction of cellular
immunity. So far, most DNA vaccine envelope genes tried have been from tissue culture
adapted virus strains (Boyer et al 1997), which often differ in several aspects from primary
clinical isolates (such as early isolates), e.g. in co-receptor usage (Choe et al 1996, Dragic et
al 1997).

One disadvantage of HIV envelope based DNA vaccines may be the intrinsically low
expression, which is regulated by Rev. This may prevent an optimal
investigation of the vaccines in small animal models like mice, where Rev functions
suboptimally. Recently it has been shown, using the tissue culture adapted HIV-1 MN strain,
that an exchange of the HIV codon usage to that of highly expressed mammalian genes
greatly improves the expression in mammalian cell lines and renders the HIV expression Rev
independent (Haas et al 1996). Additionally, it is known that rare codons cause pausing of
the ribosome, which leads to a failure in completing the nascent polypeptide chain and an
uncoupling of transcription and translation. Pausing of the ribosome is thought to lead to
exposure of the 3' end of the mRNA to cellular ribonucleases.

The world-wide spread of HIV-1 has presently resulted in 8,500 new infections daily, and
AIDS is now the number 1 cause of death among US males (and number 3 among US
females) aged 25-40 years. The epidemic hot-spots now include Eastern Europe, India,
South East Asia and southern Africa. The attempts to solve this world-wide problem involve
education, prevention, treatment and vaccine development. Affordable protective vaccines
represent the best solution to the world-wide problem of infection with HIV-1. Induction of
virus neutralising antibodies is one of the key components in vaccine development. Several
recombinant envelope vaccines have been tested in humans and animals; however, they
seem unable to induce sufficient protection. In this respect DNA vaccination may provide a
different and more natural mode of antigen presentation. It is hoped that the immune
responses induced by such DNA vaccines could aid in limiting virus replication, slowing
disease progression or preventing occurrence of disease. Unfortunately, many HIV envelope
vaccines induce only moderate levels of antibodies. This could in part be due to limitations in
expression, influenced by regulation by the Rev protein and by a species-specific and biased
HIV codon usage. Virus variability is also considered a barrier to the development of
antibody based vaccines, and thus a tool for easier production of closely related vaccine
variants is needed.

It has been suggested to improve the immunogenicity and antigenicity of epitopes by certain
mutations in the envelope gene. An elimination of certain immune dominant epitopes (like
V3) could render less immune dominant but more relevant, conserved, or hidden epitopes
more immunogenic (Bryder et al 1999). Also, elimination of certain N-linked glycosylation
sites could improve the exposure of relevant epitopes and increase the immunogenicity of
those epitopes. Thus, it is possible that elimination of the glycosylation sites in V1 and V2
may expose neutralising epitopes in a more favourable way (Kwong et al 1998, Wyatt et al
1998). The HIV envelope contains putative internalisation sequences in the intracellular part
of gp41 (Sauter et al 1996). Thus it would be relevant to eliminate and/or mutate the
internalisation signals in a membrane bound HIV envelope vaccine gene to increase the
amount of surface exposed vaccine derived HIV glycoproteins such as gp150. Since the antibody
response, which is measured and calculated in titers, is improved by adding the secreted
gp120 as opposed to adding the membrane bound form (Vinner et al 1999), it could be
advantageous to express the vaccine as a secreted gp120 or a secreted gp140. The latter would
include important parts of gp41, such as the 2F5 neutralising linear epitope (Mascola et al
1997).

Summary of the invention

Our suggested solution to the problems described above is to design DNA envelope 
vaccines from a clinical primary isolate with Rev-independent high expression in mammals, 
that is built as a cassette for easy variant vaccine production. 

A method of producing a nucleotide sequence construct with codons from highly expressed
mammalian proteins, based on a cassette system and coding for an early, primary HIV envelope,
is described. The method comprises the steps of: direct cloning of an HIV gene derived from
a patient within the first 12 months of infection, thereby obtaining a first nucleotide sequence;
designing a second nucleotide sequence utilising the most frequent codons from mammalian
highly expressed proteins to encode the same amino acid sequence as the first nucleotide
sequence; redesigning the second nucleotide sequence so that restriction enzyme sites
surround the regions of the nucleotide sequence encoding functional regions of the amino
acid sequence and so that selected restriction enzyme sites are removed, thereby obtaining
a third nucleotide sequence encoding the same amino acid sequence as the first and the
second nucleotide sequence; redesigning the third nucleotide sequence so that the terminals
contain convenient restriction enzyme sites for cloning into an expression vehicle; producing
snuts between restriction enzyme sites as well as terminal snuts; and introducing the snuts into
an expression vehicle by ligation. The nucleotide sequence construct obtained by this
method uses the mammalian highly expressed codons (figure 1), renders the envelope
gene expression Rev independent, and allows easy cassette exchange of regions
surrounded by restriction enzyme sites that are important for immunogenicity, function, and
expression.

The method can, for example, lead to the disclosed synthetic, Rev-independent, clinical
(such as early), primary HIV-1 envelope vaccine gene, built as a multi cassette. From the 
sequence of the envelope of the HIV-1 BX08 isolate (personal communication from Marc 
Girard, Institute Pasteur, Paris), the present inventors have designed a synthetic BX08 HIV-1 
envelope vaccine nucleotide sequence construct. 

With the great diversity of envelopes in HIV among different patients and within one patient,
it would be of advantage to vaccinate with several envelope variants, all being highly
expressed. To avoid synthesising several full length envelopes, it is much easier to
exchange relevant parts of an envelope cassette with those of various strains in a multivalent vaccine.

Whether it is the disclosed synthetic BX08 nucleotide sequence construct, or any of the
nucleotide sequence constructs obtained by the method, they are designed to generate
suitable DNA vaccines against HIV, specifically HIV-1. In this case the mammal, preferably a
human being, is inoculated with the nucleotide sequence construct in an expression vehicle
and constitutes a host for the transcription and translation of the nucleotide sequence
construct. The nucleotide sequence constructs of the present invention can furthermore be
used for the production of recombinant protein antigens. In this case the nucleotide
sequence construct is placed in an expression vehicle and introduced into a system (e.g. a
cell-line), allowing production of a recombinant protein with the same amino acid sequence.
The recombinant protein is then isolated and administered to the mammal, preferably a
human being. The immune system of the mammal will then direct antibodies against
epitopes on the recombinant protein. The mammal, preferably a human being, can thus be
primed or boosted with DNA and/or recombinant protein obtained by the method of the
invention.

A relevant HIV DNA vaccine can potentially be used not only as a prophylactic vaccine, but
also as a therapeutic vaccine in HIV infected patients, e.g. during antiviral therapy. An HIV
specific DNA vaccine will have the possibility to induce or re-induce the wanted specific
immunity and help the antiviral therapy in limiting or even eliminating the HIV infection. The
immunogenicity and antigenicity of epitopes in the envelope can be improved by certain
mutations in the envelope gene. The cassette system allows for easy access to the relevant
parts of the envelope gene, and thereby eases the process of genetic manipulation.
Some suggested mutations are: elimination of certain immune dominant epitopes (like
V3); elimination of certain N-linked glycosylation sites (like glycosylation sites around V2);
elimination and/or mutation of the nucleotide sequence encoding the internalisation signals in
the cytoplasmic part of a membrane bound HIV envelope to increase the amount of surface
exposed vaccine derived HIV glycoproteins; and elimination or mutation of the cleavage site
between gp120 and gp41, with introduced mutations in gp41 for preserving conformational
epitopes.

Table 1 below lists the nucleotide sequence constructs of the invention by the names used
herein, as well as by reference to the relevant SEQ ID NOs of the DNA sequences and of the amino
acid sequences encoded by the DNA sequences in the preferred reading frame. It should be
noted that the snut name consists of the number of the approximate position of the end of
the snut and the restriction enzyme used to cleave and/or ligate that end of the snut.

Table 1 List of names of nucleotide sequence constructs (Snuts (S) and Pieces (P))
with reference to SEQ ID NO for the nucleotide sequence and protein sequence.

Name              Nucleotide SEQ ID NO:   Protein SEQ ID NO:
S0-N-Lang         1                       2
S235EcoRV         3                       4
S375PstI          5                       6
S495ClaI          7                       8
S650-720EcoRI     9                       10
S900XbaI          11                      12
S990SacI          13                      14
S1110SpeI         15                      16
S1265XhoI         17                      18
S1265gp120        19                      20
S1265gp160        21                      22
S1465PstI         23                      24
S1465PstI rvs     25                      26
1 vOvAUdr         27                      28
S1700EagI         29                      30
S1890HindIII      31                      32
S2060SacII        33                      34
S2190ClaI         35                      36
S2330PstI         37                      38
S2425ES           39                      40
P1                41                      42
P2                43                      44
P3                45                      46
P3GV1             47                      48
P3GV1V2           49                      50
P3GV2             51                      52
P4gp160           53                      54
P4gp150           55                      56
P4gp140           57                      58
P5                59                      60
P8gp160           61                      62
P8gp150           63                      64
P8gp140           65                      66
synBX08-140       67                      68
synBX08-150       69                      70
synBX08-160       71                      72
synBX08-120       73                      74
synBX08-41        75                      76
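
The naming convention stated before Table 1 (approximate end position followed by the enzyme used at that end) can be read mechanically. The following is a minimal sketch under stated assumptions: the function name `parse_snut_name` and the regular expression are hypothetical and not part of the disclosure, and names whose suffix is not an enzyme (e.g. S1265gp120) are simply returned with that suffix as a label.

```python
import re

# Hypothetical pattern: "S", an optional "start-" prefix, the approximate end
# position, then a label (usually the restriction enzyme at that end).
SNUT_NAME = re.compile(r"^S(?:\d+-)?(?P<end>\d+)(?P<label>[A-Za-z][A-Za-z0-9]*)$")

def parse_snut_name(name):
    """Return (approximate end position, label) for a snut name."""
    match = SNUT_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognised snut name: {name!r}")
    return int(match.group("end")), match.group("label")

if __name__ == "__main__":
    for example in ("S495ClaI", "S650-720EcoRI", "S1265gp120"):
        print(example, parse_snut_name(example))
```

Names that do not follow the pattern, such as S0-N-Lang, raise an error rather than being guessed at.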

Detailed disclosure of the invention

One aspect of the present invention relates to a method for producing a nucleotide sequence 
construct coding for an HIV gene. The nucleotide sequence construct is produced as a 
cassette system consisting of snuts. A snut (S) is a nucleotide sequence construct between
restriction enzyme cleavage sites comprising the minimal entity of the cassette system.

First an HIV gene is obtained from a patient within the first 12 months of infection. The term
HIV should be understood in the broadest sense and includes HIV-1 and HIV-2. It is possible
to determine the period in which the infection has taken place with an accuracy depending
on the frequency of the blood tests taken from the patient. For example, patients suffering
from various diseases, such as lack of certain factors in their blood or hepatitis, have their
blood tested on a regular basis, making it possible to determine the period in which the
infection has taken place. Apart from patients with diseases wherein blood tests are used to
monitor the course of the disease, other groups of patients have blood tests taken, e.g. blood
donors. Unfortunately, humans are still infected due to transfer of virus in blood samples,
medical equipment, etc., in which cases it is possible to determine the date of infection
within a time frame of a few days. The importance of obtaining the virus early
in the course of the infection is due to the known fact that many early isolates share the
common feature of staying relatively constant in their envelope sequences (Karlsson et al.,
1998). As these early isolates may share cross-reactive antibody and/or T-cell epitopes, a
vaccine based on such early isolates would have a better chance of inducing an immune
response to shared epitopes of the virus. It is believed that an early, directly cloned virus
isolate will share immunogenic sites with other early virus isolates seen during an HIV
infection, so that if a mammal generates antibodies and/or T-cells directed against these
epitopes, the transferred virus will be eliminated prior to the extensive mutations that may
occur after approximately 12 months of infection. Thus, the virus should be isolated as early
as possible, that is within the first 12 months of infection, such as 11, 10, 9, 8, 7, 6, 5, 4, 3, 2,
1, or 0.5 months after infection.

The HIV gene for the genetic vaccine is preferably cloned directly from viral RNA or from proviral
DNA. Direct cloning in this application means that the virus is not multiplied in stable cell
lines in vitro. It is presently expected that passing the virus through a stable cell line will
promote mutation in the virus gene. It is particularly preferred not to pass the virus through
cell lines selecting for viruses with CXCR4 receptor usage. Direct cloning also includes
multiplication of virus in e.g. PBMC (peripheral blood mononuclear cells), since all viruses can
multiply in PBMC, and this type of multiplication generally does not select for CXCR4
receptor usage. Multiplication of virus is often necessary prior to cloning. Preferably, cloning
is performed directly on samples from the patient. In one embodiment of the invention,
cloning is performed from patient serum. The cloning is then performed directly on the HIV
virus, that is on RNA. In another embodiment of the invention, cloning is performed from
infected cells. The cloning is then performed on HIV virus incorporated in the genome of an
infected cell (e.g. a lymphocyte), that is on DNA. In the latter case the virus might be a silent
virus, that is a non-replicating virus. To evaluate whether the virus is silent, its capability of
multiplication in e.g. PBMC is tested.

Cloning is a technique well known to a person skilled in the art. A first nucleotide sequence is 
hereby obtained. In another aspect of the invention, the first nucleotide sequence, sharing 
the properties mentioned with direct cloning, is obtained by other means. This could be from 
a database of primary isolates or the like. 

Based on the first nucleotide sequence, the amino acid sequence encoded by said
nucleotide sequence is determined. A second nucleotide sequence encoding the same
amino acid sequence is then designed utilising the most frequent codons from highly
expressed proteins in mammals (e.g. figure 1, presenting the most frequent codons from
highly expressed proteins in humans).
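
The design of the second nucleotide sequence can be sketched as a translate-then-back-translate step. This is an illustration only: the `PREFERRED` table below is an assumed set of frequently used human codons, not a copy of figure 1, and the function names are hypothetical. Figure 1 of the disclosure remains the authoritative codon table.

```python
from itertools import product

# Standard genetic code, built in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# ASSUMPTION: one commonly cited preferred codon per amino acid for highly
# expressed human genes; replace with the table of figure 1 for real use.
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}

def translate(dna):
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(GENETIC_CODE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def optimise(first_sequence):
    """Return a second sequence encoding the same protein with preferred codons."""
    protein = translate(first_sequence)
    second_sequence = "".join(PREFERRED[aa] for aa in protein)
    # The second sequence must encode exactly the same amino acid sequence.
    assert translate(second_sequence) == protein
    return second_sequence

if __name__ == "__main__":
    print(optimise("ATGAGAGTAAAA"))
```

The built-in check that the protein is unchanged mirrors the requirement that the first and second nucleotide sequences encode the same amino acid sequence.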

Presently, it appears that the usage of the most frequent codons from mammalian highly
expressed proteins has two advantages: 1) the expression is Rev independent; 2) the level
of expression is high. The Rev independence is especially advantageous when performing
experiments in mice, where the Rev system functions sub-optimally. For use in a
human vaccine, Rev independence and high expression are important to increase the
amount of antigen produced. The determination of the codons for high expression is in this
context based on the statistics from human highly expressed proteins (Haas, Park and Seed,
1996, hereby incorporated by reference). It is contemplated that the expression of a protein
can be even higher when current research on binding between codon (on the mRNA) and
anticodon (on the tRNA) reveals codons with optimal binding capabilities, and when
interactions between codons and/or between anticodons are known.

The second nucleotide sequence designed utilising optimised codons is then redesigned to 
obtain a third nucleotide sequence. The purpose of the redesigning is to create unique
restriction enzyme sites around the nucleotide sequence encoding functional regions of the
amino acid sequence. By having unique restriction enzyme sites around the nucleotide
sequence encoding functional regions of the amino acid sequence, the nucleotide sequence
encoding functional regions of the amino acid sequence can easily be isolated, changed, and
re-inserted. Examples of functional regions of the amino acid sequence are transmembrane
spanning regions, immunodominant regions, regions with antibody cross reacting domains,
fusion domains and other regions important for immunogenicity and expression such as 
variable region 1 (V1), variable region 2 (V2), variable region 3 (V3), variable region 4 (V4) 
and variable region 5 (V5). 

It is important to select the restriction enzyme sites with care. When the second
nucleotide sequence is changed to insert restriction enzyme sites around the nucleotide sequence
encoding functional regions of the amino acid sequence, the third nucleotide sequence must
still code for the same amino acid sequence as the second and first nucleotide sequences do.
Thus, if necessary, the second nucleotide sequence is redesigned by changing from
optimised codons to less optimal codons. It is understood that the restriction enzyme sites
around the nucleotide sequence encoding functional regions of the amino acid sequence
should preferably be placed in the terminal region of the nucleotide sequence encoding
functional regions of the amino acid sequence. That is, preferably outside the nucleotide
sequence encoding functional regions of the amino acid sequence, such as 90 nucleotides
away, e.g. 81, 72, 63, 54, 45, 36, 27, 21, 18, 15, 12, 9, 6, or 3 nucleotides away, but they could also
be inside the nucleotide sequence encoding functional regions of the amino acid sequence,
such as 54, 45, 36, 27, 21, 18, 15, 12, 9, 6, or 3 nucleotides inside the nucleotide sequence
encoding the functional region of the amino acid sequence.
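
The constraint that an inserted restriction enzyme site must leave the encoded amino acid sequence unchanged can be checked by brute force. The sketch below is a simplified illustration, not the inventors' procedure: it overwrites the sequence with the recognition site at every offset and keeps the offsets where the translation is unchanged. The function names and the short example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(GENETIC_CODE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def silent_site_positions(seq, site):
    """Return 0-based offsets where `site` can be written into `seq`
    without changing the encoded amino acid sequence."""
    protein = translate(seq)
    positions = []
    for i in range(len(seq) - len(site) + 1):
        candidate = seq[:i] + site + seq[i + len(site):]
        # Skip offsets where the site is already present (nothing to insert).
        if candidate != seq and translate(candidate) == protein:
            positions.append(i)
    return positions

if __name__ == "__main__":
    # Hypothetical 12-nucleotide example encoding M-E-F-K; EcoRI = GAATTC.
    print(silent_site_positions("ATGGAGTTTAAG", "GAATTC"))
```

In practice the candidate offsets would be restricted to the terminal regions of the functional region, as described above, and each accepted change may replace an optimised codon with a less optimal one.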

The type of restriction enzyme sites allowed is determined by the choice of expression
vector. In certain cases, the number of restriction enzyme sites is limited and it is hard, if not
impossible, to place unique restriction enzyme sites around all the nucleotide sequences
coding for functional regions of the amino acid sequence. This problem can be solved by
dividing the entire nucleotide sequence into pieces, so that each piece comprises only
unique restriction enzyme sites. Modifications to each of the pieces are performed separately
prior to assembly of the pieces. It is preferred that the nucleotide sequence is divided into 9
pieces. In another aspect, the nucleotide sequence is divided into 8 pieces, or 7, or 6, or 5, or
4, or 3, or 2 pieces. It is especially preferred that the nucleotide sequence is divided into 3
pieces.



Thus, the redesign of the second nucleotide sequence is an interplay between the choice
of cloning vector, expression vector, selection of restriction enzyme sites, division into
pieces, and exchange of codons to insert restriction enzyme sites. In a preferred
embodiment of the present invention the cloning vector is Bluescript, allowing the restriction
enzyme sites chosen from the group consisting of: EagI, MluI, EcoRV, PstI, ClaI, EcoRI,
XbaI, SacI, SpeI, XhoI, HindIII, SacII, NotI, BamHI, SmaI, SalI, DraI, KpnI. If other cloning
vectors are chosen, other restriction enzyme sites will be available as known by the person
skilled in the art.

As a part of the redesigning of the second nucleotide sequence, selected restriction enzyme
sites may be removed. The restriction enzyme sites selected for removal are those
that are of the same type as the ones already chosen above and that are placed within
the same piece. The removal of these restriction enzyme sites is performed by changing
from optimised codons to less optimal codons, while maintaining codons for the same amino acid
sequence.
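
Removal of an unwanted restriction enzyme site by a synonymous codon change can be sketched as follows. This is a minimal illustration under stated assumptions: the function `remove_site` is hypothetical, it tries only single-codon substitutions, and it takes the first synonymous codon that works, whereas a real design would presumably prefer the most frequently used remaining codon.

```python
from itertools import product

# Standard genetic code, built in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# All codons for each amino acid.
SYNONYMS = {}
for codon, aa in GENETIC_CODE.items():
    SYNONYMS.setdefault(aa, []).append(codon)

def translate(dna):
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    return "".join(GENETIC_CODE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def remove_site(seq, site):
    """Remove every occurrence of `site` from in-frame `seq` by single
    synonymous codon changes, keeping the encoded protein unchanged."""
    protein = translate(seq)
    while site in seq:
        pos = seq.find(site)
        first = pos // 3                    # first codon overlapping the site
        last = (pos + len(site) - 1) // 3   # last codon overlapping the site
        for c in range(first, last + 1):
            original = seq[3 * c:3 * c + 3]
            for alt in SYNONYMS[GENETIC_CODE[original]]:
                trial = seq[:3 * c] + alt + seq[3 * c + 3:]
                if alt != original and trial.count(site) < seq.count(site):
                    seq = trial
                    break
            else:
                continue  # no synonym of this codon helped; try the next codon
            break         # a substitution was made; re-scan the sequence
        else:
            raise ValueError("no single synonymous change removes the site")
    assert translate(seq) == protein
    return seq

if __name__ == "__main__":
    # Hypothetical example: an internal EcoRI site (GAATTC) in M-E-F-K.
    print(remove_site("ATGGAATTCAAG", "GAATTC"))
```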

The third nucleotide sequence is redesigned so that the terminal snuts contain convenient
restriction enzyme sites for cloning into an expression vehicle. The expression "vehicle"
means any nucleotide molecule, e.g. a DNA molecule, derived e.g. from a plasmid,
bacteriophage, or mammalian or insect virus, into which fragments of nucleic acid may be
inserted or cloned. An expression vehicle will contain one or more unique restriction enzyme
sites and may be capable of autonomous replication in a defined host or vehicle organism
such that the cloned sequence is produced. The expression vehicle is an autonomous
element capable of directing the synthesis of a protein. Examples of expression vehicles are
mammalian plasmids and viruses, tag containing vectors and viral vectors such as
adenovirus, vaccinia Ankara, adeno-associated virus, canarypox virus, Semliki Forest virus
(SFV), Modified Vaccinia Virus Ankara (MVA), and Sindbis virus. In one embodiment of the
invention, the expression vector contains tag sequences. In another embodiment of the
invention a bacterium is transformed with an expression plasmid vector and the bacterium is
then delivered to the patient. Preferred expression vehicles are Semliki Forest virus (SFV),
adenovirus and Modified Vaccinia Virus Ankara (MVA).

The snuts are produced by techniques well known by the person skilled in the art. The
preferred method for synthesising snuts is herein referred to as "the minigene approach",
wherein complementary nucleotide strands are synthesised with specific overhanging
sequences for annealing and subsequent ligation into a vector. This can be performed with
two sets of complementary nucleotide strands, or with three sets of complementary
nucleotide strands. The minigene approach minimises the known PCR errors of mismatches
and/or deletions, which may occur due to hairpins in a GC rich gene with mammalian highly
expressed codons. In figures 10-21, the production of a representative selection of snuts is
illustrated.

WO 00/2956! PCT/DKOO/00144

For the production of long snuts, that is snuts of more than about 240 nucleotides, the
technique of overlapping PCR is preferred, as illustrated in figure 8. Herein two nucleotide
strands about 130 nucleotides long with an overlap are filled in to obtain a double strand, which
is subsequently amplified by PCR.
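
The arithmetic behind the two overlapping strands can be made explicit. In the sketch below, which is an illustration and not the design of figure 8, a snut is split into a forward strand and a reverse-complement strand of a fixed length (130 nucleotides is taken from the text; the function names and the test sequence are hypothetical). The overlap follows as twice the strand length minus the snut length, e.g. 15 nucleotides for a 245 nucleotide snut.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def overlap_oligos(snut, oligo_len=130):
    """Split `snut` into two overlapping strands of `oligo_len` nucleotides.

    Returns (forward strand, reverse strand, overlap length). The reverse
    strand is written 5'->3', i.e. as the reverse complement of the 3' part.
    """
    n = len(snut)
    if not oligo_len < n < 2 * oligo_len:
        raise ValueError("snut length must allow two strands with an overlap")
    forward = snut[:oligo_len]
    reverse = reverse_complement(snut[n - oligo_len:])
    overlap = 2 * oligo_len - n
    return forward, reverse, overlap

def fill_in(forward, reverse, overlap):
    """Top strand of the double strand obtained after fill-in synthesis."""
    return forward + reverse_complement(reverse)[overlap:]

if __name__ == "__main__":
    snut = ("ACGTTGCA" * 31)[:245]  # hypothetical 245-nucleotide snut
    fwd, rev, ovl = overlap_oligos(snut)
    print(len(fwd), len(rev), ovl, fill_in(fwd, rev, ovl) == snut)
```

A short overlap in a GC rich gene may anneal unevenly, so the chosen overlap would still have to be checked experimentally.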

For the production of multiple snuts with a length of less than about 210 nucleotides, one
preferred technique is normal PCR. In a preferred production technique the snuts are
synthesised with the same 5' flanking sequences and with the same 3' flanking sequences,
as illustrated in figure 9. The advantage of this approach is that the same PCR primer set
can be used for amplification of several different snuts.

As known by the person skilled in the art, special conditions have to be used for each
individual PCR reaction, and it should be optimised to avoid inherent problems like deletions
and mismatches when amplifying GC rich genes from synthetic ssDNA material. Whichever of
the above mentioned techniques is used, it is well known by the person skilled in the art
that it will be necessary to correct unavoidable mismatches produced due to the
nucleotide strand synthesis material and/or the PCR reaction. This can be performed by site
directed mutagenesis techniques.

After the various snuts have been produced, they are assembled into pieces and 
subsequently into the complete gene. Methods for assembly (such as ligation) are well 
known by the person skilled in the art. 

In a preferred embodiment of the present invention the HIV gene encodes the entire HIV
envelope. It is understood that the HIV envelope can be the full length envelope gp160 as
well as shorter versions such as gp150, gp140, and gp120 with or without parts of gp41.

As will be known by the person skilled in the art, HIV is divided into several groups.
These groups presently include group M, group O, and group N. Further, HIV is divided
into subtypes A, B, C, D, E, F, G, H, I, and J. In the present invention subtype B is preferred
due to the high prevalence of this subtype in the Western countries.

The determination of groups and subtypes is based on the degree of nucleotide sequence
identity in the envelope gene and is presently defined as follows: if the sequence identity is more
than 90%, the viruses belong to the same subtype; if the sequence identity is between 80%
and 90%, the viruses belong to the same group; if the sequence identity is less than 80%, the
viruses are considered as belonging to different groups.
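
The thresholds above translate directly into a small classification rule. The sketch below assumes two envelope gene sequences that have already been aligned to equal length, counts gap positions as mismatches, and treats identities of exactly 80% and 90% as "same group"; these boundary choices and the function names are assumptions, since the text does not specify them.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of identical positions between two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)

def classify(identity):
    """Apply the identity thresholds for subtype and group membership."""
    if identity > 90:
        return "same subtype"
    if identity >= 80:  # ASSUMPTION: boundary values count as "same group"
        return "same group"
    return "different groups"

if __name__ == "__main__":
    a = "ACGTACGTAC"
    b = "ACGTACGTAA"  # hypothetical aligned fragment, 9 of 10 positions match
    identity = percent_identity(a, b)
    print(identity, classify(identity))
```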

One aspect of the invention relates to a nucleotide sequence construct in isolated form which
has a nucleotide sequence with the general formula (I), (II), (III), or (IV) 



(I)   P1-S495ClaI-S650-720EcoRI-P2-S1265gp120

(II)  P1-S495ClaI-S650-720EcoRI-P2-S1265XhoI-S1465PstI-P4gp140

(III) P1-S495ClaI-S650-720EcoRI-P2-S1265XhoI-S1465PstI-P4gp150

(IV)  P1-S495ClaI-S650-720EcoRI-P2-S1265XhoI-S1465PstI-P4gp160-S2060SacII-P5

wherein P1 designates the nucleotide sequence SEQ ID NO: 41, a nucleotide sequence
complementary thereto, or a nucleotide sequence with a sequence identity of at least 90%
thereto;

wherein S495ClaI designates the nucleotide sequence SEQ ID NO: 7, a nucleotide sequence
complementary thereto, or a nucleotide sequence with a sequence identity of at least 95%
thereto;

wherein S650-720EcoRI designates the nucleotide sequence SEQ ID NO: 9, a nucleotide
sequence complementary thereto, or a nucleotide sequence with a sequence identity of at
least 95% thereto;

wherein P2 designates the nucleotide sequence SEQ ID NO: 43, a nucleotide sequence
complementary thereto, or a nucleotide sequence with a sequence identity of at least 85%
thereto;

wherein S1265gp120 designates the nucleotide sequence SEQ ID NO: 19, a nucleotide
sequence complementary thereto, or a nucleotide sequence with a sequence identity of at
least 70% thereto;

wherein S1265XhoI designates the nucleotide sequence SEQ ID NO: 17, a nucleotide sequence
complementary thereto, or a nucleotide sequence with a sequence identity of at least 80%
thereto;

wherein S1465PstI designates the nucleotide sequence SEQ ID NO: 23, a nucleotide sequence
complementary thereto, or a nucleotide sequence with a sequence identity of at least 90%
thereto;

wherein P4gp140 designates the nucleotide sequence SEQ ID NO: 57, a nucleotide sequence
complementary thereto, or a nucleotide sequence with a sequence identity of at least 85%
thereto;

wherein P4gp150 designates the nucleotide sequence SEQ ID NO: 55, a nucleotide sequence
complementary thereto, or a nucleotide sequence with a sequence identity of at least 85%
thereto;

wherein P4gp160 designates the nucleotide sequence SEQ ID NO: 53, a nucleotide sequence
complementary thereto, or a nucleotide sequence with a sequence identity of at least 85%
thereto;

wherein S2060SacII designates the nucleotide sequence SEQ ID NO: 33, a nucleotide
sequence complementary thereto, or a nucleotide sequence with a sequence identity of at
least 98% thereto; and

wherein P5 designates the nucleotide sequence SEQ ID NO: 59, a nucleotide sequence
complementary thereto, or a nucleotide sequence with a sequence identity of at least 85%
thereto.

The design of the parent synthetic BX08 gp160 envelope cassette gene with its variant 
length genes gp150, gp140, gp120 is outlined in figure 2. 

The nucleotide sequence construct with the formula (I)

(I) P1-S495ClaI-S650-720EcoRI-P2-S1265gp120

(visualised in figure 3) (SEQ ID NO: 73) codes for the amino acid sequence of gp120 (SEQ
ID NO: 74). This amino acid sequence is the part of the HIV envelope that is secreted. Thus,
it contains the immunogenic epitopes without being bound to the cell membrane. This is of
particular advantage if the nucleotide sequence construct is used for production of
recombinant antigens or for a DNA vaccine, as the antibody immune response may be higher
to secreted versus membrane bound HIV antigens.

The nucleotide sequence construct with the formula (II)

(II) P1-S495ClaI-S650-720EcoRI-P2-S1265XhoI-S1465PstI-P4gp140

(visualised in figure 4) (SEQ ID NO: 67) codes for the amino acid sequence of gp140 (SEQ
ID NO: 68). This amino acid sequence comprises gp120 and the extracellular part of the
gp41 protein. The amino acid sequence is secreted due to the lack of the transmembrane
spanning region. This is of particular advantage if the nucleotide sequence construct is used
for production of recombinant antigens, as the immunogenic and/or antigenic epitopes in the
extracellular part of gp41 are included, and is of particular advantage for a DNA vaccine, as
the antibody immune response may be higher to secreted gp120 versus membrane bound
HIV antigens.

The nucleotide sequence construct with the formula (III)

(III) P1-S495ClaI-S650-720EcoRI-P2-S1265XhoI-S1465PstI-P4gp150

(visualised in figure 5) (SEQ ID NO: 69) codes for the amino acid sequence of gp150 (SEQ
ID NO: 70). This amino acid sequence contains all of the envelope protein gp160 except the
C-terminal tyrosine-containing internalisation signals in the intracellular part of gp41. The
membrane bound surface expression of the amino acid sequence is thereby maintained and
enhanced. Mimicking of the organisation of the native epitope conformation may be expected,
making this nucleotide sequence construct of particular advantage if the nucleotide
sequence construct is used as a vaccine.

The nucleotide sequence construct with the formula (IV)

(IV) P1-S495ClaI-S650-720EcoRI-P2-S1265XhoI-S1465PstI-P4gp160-S2060SacII-P5

(visualised in figure 6) (SEQ ID NO: 71) codes for the amino acid sequence of gp160 (SEQ
ID NO: 72), i.e. the entire envelope.

The nucleotide sequence construct designated P1 comprises the nucleotide sequence
encoding the amino acid sequence in the first variable region (V1) and the amino acid
sequence in the second variable region (V2). In one embodiment of the invention the first
variable region is surrounded by EcoRV and PstI restriction enzyme sites, and the second
variable region is surrounded by PstI and ClaI restriction enzyme sites but, as stated above,
the choice of restriction enzyme sites can vary.

The nucleotide sequence construct designated S650-720EcoRI comprises the nucleotide
sequence encoding the amino acid sequence in the third variable region (V3). In one
embodiment of the present invention S650-720EcoRI is characterised by the restriction enzyme
sites EcoRI and XbaI in the terminals.

The nucleotide sequence construct designated P2 comprises the nucleotide sequence
encoding the amino acid sequence of the fourth variable and constant region (V4 and C4). In
one embodiment of the present invention the fourth variable region is surrounded by SacI and
XhoI restriction enzyme sites.

The nucleotide sequence construct designated S1265gp120 comprises the nucleotide sequence
encoding the amino acid sequence of the fifth variable and constant region (V5 and C5).
S1265gp120 further comprises a nucleotide sequence encoding a C-terminal stop codon.

The nucleotide sequence construct designated P4gp140 comprises the nucleotide sequence
encoding the amino acid sequence of the transmembrane spanning region. P4gp140 further
comprises a nucleotide sequence encoding a C-terminal stop codon prior to the
transmembrane spanning region.

The nucleotide sequence construct designated P4gp160 comprises the nucleotide sequence
encoding the amino acid sequence of the transmembrane spanning region (transmembrane
spanning domain: TMD). In a preferred embodiment of the present invention the
transmembrane spanning region is surrounded by HindIII and SacII restriction enzyme sites.

The term "sequence identity" indicates the degree of identity between two amino acid
sequences or between two nucleotide sequences calculated by the Wilbur-Lipman alignment
method (Wilbur et al, 1983).

The nucleotide sequence constructs with the formula (I), (II), (III), or (IV) illustrate the
flexibility of the present invention. Producing a gene with the described method enables
the production of a plethora of antigens with various immunogenic epitopes and various
advantages for production and vaccine purposes. To further illustrate the flexibility of the
invention, other changes and mutations are suggested below.

In order to improve the immunogenicity of the nucleotide sequence constructs of the
invention it is suggested to change the nucleotide sequence such that one or more
glycosylation sites are removed in the amino acid sequence. By removal of shielding
glycosylations, epitopes are revealed to the immune system of the mammal, rendering the
construct more immunogenic. The increased immunogenicity can be determined by an
improved virus neutralisation. Changing the nucleotide sequence such that one or more
N-linked glycosylation sites are removed in the amino acid sequence is well known to the
person skilled in the art. Potential glycosylation sites are N in the amino acid sequences
N-X-T or N-X-S (wherein X is any amino acid besides P). The glycosylation site can be removed
by changing N to any other amino acid, changing X to a P, or changing T (or S) to any amino
acid other than T or S. It is preferred that N is changed to Q by an A to C mutation at the first
nucleotide in the codon and a C to G mutation at the third nucleotide in the codon (AAC to
CAG). This is preferred to increase the GC content in the nucleotide sequence construct. As
an alternative N is changed to Q by an A to C mutation at the first nucleotide in the codon,
and a C to A mutation at the third nucleotide in the codon (AAC to CAA). Preferred mutations
in the synthetic BX08 envelope gene to remove potential N-linked glycosylation sites in V1
and/or V2 are A307C + C309A and/or A325C + C327G and/or A340C + C342A and/or
A385C + C387A and/or A469C + C471A. Examples of such changes are illustrated in
SEQ ID NOs: 47, 49, and 51.
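
The sequon rule and the preferred N to Q codon change described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy sequences are hypothetical and are not taken from the BX08 gene, and residue positions are 0-based.

```python
import re

# N-X-[ST] sequon where X is any residue except proline.
# The lookahead lets overlapping sequons be reported as well.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(protein):
    """Return the 0-based positions of N in each potential N-linked site."""
    return [m.start() for m in SEQUON.finditer(protein)]

def asn_to_gln(dna, residue_index):
    """Change the AAC codon of one residue to CAG (A->C at base 1, C->G at base 3)."""
    start = 3 * residue_index
    codon = dna[start:start + 3]
    if codon != "AAC":
        raise ValueError(f"expected AAC at residue {residue_index}, found {codon}")
    return dna[:start] + "CAG" + dna[start + 3:]

# Hypothetical toy sequences (not BX08): M N G T N P S A N K S
toy_protein = "MNGTNPSANKS"
toy_dna = "ATGAACGGCACCAACCCCAGCGCCAACAAGAGC"

sites = find_sequons(toy_protein)        # N-P-S is skipped because X is proline
mutated = asn_to_gln(toy_dna, sites[0])  # removes the first potential site
```

The same helper could be applied once per site when several potential glycosylation sites are to be removed.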

For historical reasons the HIVs have been divided into syncytia inducing strains and non
syncytia inducing strains. The assay to determine whether a strain is syncytia inducing is
described in Verrier et al 1997, hereby incorporated by reference. It is presently known that
viruses utilising the CXCR4 co-receptor are syncytia inducing strains. It is also presently
known that the binding site for CXCR4 involves the third variable region (V3). In
a preferred embodiment the nucleotide sequence construct is changed to create a binding
site for the CXCR4 co-receptor. This is presently performed in the third variable region,
preferably by the mutation G865C + A866G.
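
Point mutations are written throughout in the form "G865C + A866G" (original base, position, new base). A minimal sketch of how such a specification could be applied to a sequence is given below. It assumes 1-based nucleotide numbering, and the function name and toy sequence are hypothetical; checking the original base guards against applying a mutation to the wrong numbering scheme.

```python
import re

# One mutation token: original base, 1-based position, replacement base.
MUTATION = re.compile(r"^([ACGT])(\d+)([ACGT])$")

def apply_mutations(dna, spec):
    """Apply a specification such as 'G865C + A866G' to a DNA string."""
    seq = list(dna.upper())
    for token in spec.split("+"):
        match = MUTATION.match(token.strip())
        if match is None:
            raise ValueError(f"cannot parse mutation {token!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if pos < 1 or pos > len(seq):
            raise ValueError(f"position {pos} is outside the sequence")
        if seq[pos - 1] != old:
            raise ValueError(f"expected {old} at position {pos}, found {seq[pos - 1]}")
        seq[pos - 1] = new
    return "".join(seq)

# Hypothetical toy example: AAC (Asn) at positions 1-3 becomes CAG (Gln).
result = apply_mutations("AACGT", "A1C + C3G")
```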

It is well established that the HIV envelope comprises immunodominant epitopes. An
immunodominant epitope is an epitope that most antibodies from the mammal are directed
against. The antibodies directed against these immunodominant epitopes may have little
effect in elimination of the virus. It is therefore anticipated that modification of the
immunodominant epitopes will induce antibodies directed against other parts of the envelope
leading to a better elimination and neutralisation of the virus. By modification is understood
any change in the nucleotide sequence encoding an immunodominant epitope in the amino
acid sequence such that said amino acid sequence no longer contains an immunodominant
epitope. Thus, modification includes removal of the immunodominant epitope and decrease
of immunogenicity performed by mutagenesis. In a preferred embodiment of the present
invention an immunodominant epitope in the third variable region (V3) is modified, such as
deleted or altered. In a much preferred embodiment the nucleotides 793-897 are deleted.
In yet another preferred embodiment of the present invention an immunodominant epitope
has been removed from gp41, such as deleted. This is performed in P7 or P8 by elimination
of the nucleotides 1654-1710.

It is anticipated that when gp120 is dissociated from gp41 in a vaccine or antigen, two
immunodominant epitopes, one on each protein, are exposed and antibodies are directed
against these in the mammal. In the infectious virus, gp120 is coiled on top of gp41 and the
gp120/gp41 is most likely organised in a trimer, so that these immunodominant epitopes are
hidden and therefore less elimination of virus is observed. By removing the cleavage site
between gp41 and gp120 a full length gp160, gp150, or gp140 can be obtained with a
covalent binding between gp41 and gp120. Removal of the cleavage site between gp41 and
gp120 is preferably performed by the mutation C1423A. An example of such a
mutation is illustrated in the mutation of S1265XhoI (SEQ ID NO: 17) to S1265gp160 (SEQ ID NO:
21).

In order to stabilise the full length gp160, gp150, and gp140, for example when the cleavage
site between gp41 and gp120 has been removed as described above, cysteines can be
inserted, preferably inside the gp41 helix, creating disulphide bonds to stabilise a trimer of
gp41s. In a preferred embodiment of the present invention the cysteines are inserted by the
mutation 1618:CTCCAGGC:1625 to 1618:TGCTGCGG:1625. An example of such a change
is illustrated in SEQ ID NO: 25.

The above mentioned decrease in immunodominant epitopes combined with the increase in
immunogenicity of the other epitopes is expected to greatly enhance the efficacy of the
nucleotide sequence construct as a vaccine.

During the production of the nucleotide sequence construct, it is convenient to ligate the
snuts into pieces. The pieces, as described above, are characterised by their reversible
assembly as there are no duplicate restriction enzyme sites. In a preferred embodiment one
piece (herein designated P3) contains P1, S495ClaI, S650-720EcoRI, and P2. Another piece (herein
designated P8) contains S1265XhoI, S1465PstI, and P4gp160. Yet another piece (herein designated
P7) contains S1265XhoI, S1465PstI, P4gp160, S2060SacII, and P5.
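
The requirement that a piece contains no duplicate restriction enzyme sites can be checked mechanically. The following Python sketch is illustrative only: the site table is a small subset chosen for the example, the function names are hypothetical, and the test sequences are toy strings rather than BX08 pieces. Because the recognition sequences listed are palindromic, scanning one strand is sufficient.

```python
# Recognition sequences of a few enzymes used in the text (all palindromic).
SITES = {
    "EcoRI": "GAATTC",
    "XhoI": "CTCGAG",
    "PstI": "CTGCAG",
    "XbaI": "TCTAGA",
    "ClaI": "ATCGAT",
}

def count_sites(dna, site):
    """Count occurrences of a recognition sequence, including overlapping ones."""
    count, start = 0, dna.find(site)
    while start != -1:
        count += 1
        start = dna.find(site, start + 1)
    return count

def duplicated_sites(dna, sites=SITES):
    """Return the names of enzymes whose site occurs more than once in the piece."""
    dna = dna.upper()
    return sorted(name for name, site in sites.items() if count_sites(dna, site) > 1)

# Hypothetical toy pieces: the first carries EcoRI twice, the second is clean.
bad_piece = "GAATTCAAAACTCGAGTTTTGAATTC"
good_piece = "GAATTCAAAACTCGAG"
```

A piece for which `duplicated_sites` returns an empty list can be opened at any one of its sites without being cut elsewhere.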

One advantage of the present nucleotide sequence construct is the easy access to
exchange and alterations in the content and function of the nucleotide sequence and the
encoded amino acid sequence. In one embodiment the nucleotide sequence coding for a
functional region or parts thereof of the amino acid sequence is repeated. The repeat could
be back-to-back or a functional region or parts thereof could be repeated somewhere else in
the sequence. Repeated could mean two (one repetition) but could also be three, six, or nine
repeats. In a much preferred embodiment the repeated nucleotide sequence codes for
amino acids in the third variable region.

In order to improve the protective capabilities of the invention against infections with HIV,
one embodiment of the invention relates to the combination of epitopes. The present
nucleotide sequence construct allows insertion of one or more new nucleotide sequences
isolated from another group and/or subtype of HIV and/or isolated from another patient.
Hereby a vaccine or antigen with two or more epitopes from two or more HIVs is obtained. In
a preferred embodiment, the V3 is replaced by the new nucleotide sequence. In a much
preferred embodiment, the new nucleotide sequence codes for amino acids in the third
variable region of a different HIV isolate.

In order to improve the efficacy of the vaccine, aiming at raising cellular immunity, a
nucleotide sequence coding for a T-helper cell epitope is included in the nucleotide
sequence construct. The nucleotide sequence coding for a T-helper cell epitope or a
T-helper cell epitope containing amino acid sequence can be placed anywhere in the nucleotide
sequence construct as long as it does not interfere with the function of the envelope molecule.
However, it is preferably placed in the tail of the nucleotide sequence construct or between
the leader sequence and the envelope gene. The T-helper epitopes are preferably selected
from core proteins such as P24gag or from a non-HIV pathogen such as a virus or bacterium, e.g.
BCG antigen 85. For a therapeutic vaccine an HIV helper epitope is preferred since the
patient is already primed by the HIV infection. For a prophylactic vaccine, a T-helper cell
epitope from a frequently occurring non-HIV pathogen such as Hepatitis B, BCG, CMV, or EBV
is preferred. Also, since the synthetic BX08 envelope genes may contain T-helper cell
epitopes in addition to important antibody epitopes, the synthetic BX08 vaccine genes can be
mixed with other DNA vaccines to improve the efficacy of the other DNA vaccine.

One aspect of the present invention relates to individualised immunotherapy, wherein the
virus from a newly diagnosed patient is directly cloned, the envelope or subunits
corresponding to snuts or pieces are produced with highly expressed codons, inserted into any
of the nucleotide sequence constructs described above and administered to the patient as a
vaccine. Hereby a therapeutic DNA vaccine is obtained that will help the patient to break
immune tolerance or induce/reinduce an appropriate immune response. In one embodiment
the variable regions of the virus are produced with highly expressed codons and exchanged
into any of the nucleotide sequence constructs described above.

In one embodiment of the invention, the nucleotide sequence construct as described above
satisfies at least one of the following criteria:

a) serum extracted from a Macaque primate which has been immunised by administration of
an expression vector containing the nucleotide sequence construct is capable of eliminating
SHIV as determined by quantitative PCR and/or virus culturing.

b) serum extracted from a primate which has been immunised by administration of an
expression vector containing the nucleotide sequence construct is capable of neutralising
HIV-1 BX08 and/or other HIV-1 strains in vitro.

c) serum extracted from a mouse which has been immunised by administration of an
expression vector containing the nucleotide sequence construct four times in intervals of
three weeks and boosted after 15 weeks, is capable of decreasing the concentration of
HIV-antigen in a culture of HIV, serum or PBMCs by at least 50%. An example of such a procedure
is shown in example 9.

In one embodiment of the invention, the nucleotide sequence construct of the invention is
used in medicine. That is, it is used as a vaccine, for the production of a recombinant protein,
such that the recombinant protein is used as a vaccine, or the nucleotide sequence construct
or the recombinant protein is used in a diagnostic composition.

Thus, the nucleotide sequence construct of the invention can be used for the manufacture of
a vaccine for the prophylaxis of infection with HIV in humans.

Intramuscular inoculation of nucleotide constructs, i.e. DNA plasmids encoding proteins, has
been shown to result in the generation of the encoded protein in situ in muscle cells and
dendritic cells. By using cDNA plasmids encoding viral proteins, both antibody and CTL
responses were generated, providing homologous and heterologous protection against
subsequent challenge with either the homologous strain or a different strain, respectively.
The standard techniques of molecular biology for preparing and purifying DNA constructs
enable the preparation of the DNA therapeutics of this invention. While standard techniques
of molecular biology are therefore sufficient for the production of the products of this
invention, the specific constructs disclosed herein provide novel therapeutics which can
produce cross-strain protection, a result heretofore unattainable with standard inactivated
whole virus or subunit protein vaccines.

The amount of expressible DNA to be introduced to a vaccine recipient will depend on the
strength of the transcription and translation promoters used in the DNA construct, and on the
immunogenicity of the expressed gene product. In general, an immunologically or
prophylactically effective dose of about 10 µg to 300 µg is administered directly into muscle
tissue. Subcutaneous injection, intradermal introduction, impression through the skin,
inoculation by gene gun, preferably with DNA coated gold particles, and other modes of
administration such as intraperitoneal, intravenous, peroral, topical, vaginal, rectal, intranasal
or by inhalation delivery are also contemplated. It is also contemplated that booster
vaccinations are to be provided. It is further contemplated that booster vaccinations with
recombinant antigens are to be provided, administered as described above.

The DNA may be naked, that is, unassociated with any proteins, adjuvants or other agents
which impact on the recipient's immune system. In this case, it is desirable for the DNA to be
in a physiologically acceptable solution, such as, but not limited to, sterile saline or sterile
buffered saline. Alternatively, the DNA may be associated with surfactants, liposomes, such
as lecithin liposomes or other liposomes, such as ISCOMs, known in the art, as a
DNA-liposome mixture (see for example WO93/24640), or the DNA may be associated with an
adjuvant known in the art to boost immune responses, such as a protein or other carrier.
Agents which assist in the cellular uptake of DNA, such as, but not limited to, calcium ions,
detergents, viral proteins and other transfection facilitating agents may also be used to
advantage. These agents are generally referred to as transfection facilitating agents and as
pharmaceutically acceptable carriers.

Those skilled in the field of molecular biology will understand that any of a wide variety of
expression systems may be used. A wide range of suitable mammalian cells are available
from a wide range of sources (e.g. the American Type Culture Collection, Rockland, Md.;
also, see e.g. Ausubel et al. 1992). The method of transformation or transfection and the
choice of expression vehicle will depend on the host system selected. Transformation and
transfection methods are described e.g. in Ausubel et al. 1992; expression vehicles may be
chosen from those provided e.g. in P.H. Pouwels et al. 1985.

In one embodiment of the present invention the protein encoded by the nucleotide sequence
construct is produced by introduction into a suitable mammalian cell to create a
stably-transfected mammalian cell line capable of producing the recombinant protein. A number of
vectors suitable for stable transfection of mammalian cells are available to the public e.g. in
Cloning Vectors: A Laboratory Manual (P.H. Pouwels et al. 1985); methods for constructing
such cell lines are also publicly available, e.g. in Ausubel et al. 1992.

Standard reference works describing the general principles of recombinant DNA technology
include Watson, J.D. et al 1987; Darnell, J.E. et al 1986; Old, R.W. et al, 1981; Maniatis T et
al 1989; and Ausubel et al. 1992.

Figure legends

The invention is further illustrated in the following non-limiting examples and the drawing 
wherein 

Figure 1 provides the codon preference of highly expressed proteins in human cells.

Figure 2 illustrates the outline of gp120, gp140, gp150, and gp160 encoding synthetic genes
derived from the wild type sequence at the top. Variable (V) and constant (C) regions are
shown together with the leader peptide (LP) and the transmembrane spanning domain
(TMD). The approximate nucleotide positions of the restriction enzyme sites are shown.
The approximate positions of the three restriction enzyme sites dividing the full-length
gp160 gene into the three pieces, each containing only unique restriction enzyme sites, are
shown in bold.

Figure 3 illustrates the building of the synthetic gp120 gene. Variable (V) and constant (C)
regions are shown together with the leader peptide (LP) and the transmembrane spanning
domain (TMD). The approximate nucleotide positions of the restriction enzyme sites are shown.

Figure 4 illustrates the building of the synthetic gp140 gene. Variable (V) and constant (C)
regions are shown together with the leader peptide (LP) and the transmembrane spanning
domain (TMD). The approximate nucleotide positions of the restriction enzyme sites are shown.

Figure 5 illustrates the building of the synthetic gp150 gene. Variable (V) and constant (C)
regions are shown together with the leader peptide (LP) and the transmembrane spanning
domain (TMD). The approximate nucleotide positions of the restriction enzyme sites are shown.

Figure 6 illustrates the building of the synthetic gp160 gene. Variable (V) and constant (C)
regions are shown together with the leader peptide (LP) and the transmembrane spanning
domain (TMD). The approximate nucleotide positions of the restriction enzyme sites are shown.

Figure 7 illustrates the codons coding amino acids in general.

Figure 8 illustrates how overlapping PCR is performed.

Figure 9 illustrates how PCR using conserved flanking ends is performed.

Figure 10 illustrates how S1265XhoI is produced using complementary strands
(minigene-approach) technology. The S1265XhoI is ligated from three sets of complementary strands
into the vector pBluescript KS+ between restriction enzyme sites XhoI and PstI.

Figure 11 illustrates how S1465PstI is produced. The same approach as used for
the production of S1265XhoI was used, except that only two sets of complementary strands
were used.

Figure 12 illustrates the assembly of P1. The S0-N-Lang and S235EcoRV are ligated into the XbaI
and PstI sites of the S375PstI containing plasmid.

Figure 13 illustrates the assembly of P2. The S900XbaI was excised by HindIII and SacI from
its plasmid and ligated with S990SacI (SacI-SpeI) into the S1110SpeI plasmid that was opened
at the HindIII and SpeI sites.

Figure 14 illustrates the assembly of P3. S495ClaI (ClaI-EcoRI) and S650-720EcoRI (EcoRI-XbaI)
and P2 (XbaI-XhoI) were ligated simultaneously into the P1 plasmid opened at the ClaI
and XhoI sites to obtain the P3 plasmid.

Figure 15 illustrates the assembly of P4gp160. S1890HindIII (SacII-HindIII) and S1700EagI
(HindIII-EagI) were ligated simultaneously into the S1630XbaI plasmid opened by SacII and EagI.

Figure 16 illustrates the assembly of P5. S2190ClaI (ClaI-PstI) and S2330PstI (PstI-EcoRI) were
ligated into the S2425ES plasmid opened by ClaI and EcoRI.

Figure 17 illustrates the assembly of P8gp160. S1465PstI (XbaI-PstI) and S1265XhoI (PstI-XhoI) were
ligated into the P4gp160 plasmid opened by XbaI and XhoI.

Figure 18 illustrates the assembly of P8gp150. S1465PstI (XbaI-PstI) and S1265XhoI (PstI-XhoI) were
ligated into the plasmid containing P4gp150 with the stop codon. The P4gp150 plasmid was opened
at the XbaI and XhoI sites for the ligation.

Figure 19 illustrates the assembly of P8gp140. S1465PstI (XbaI-PstI) and S1265XhoI (PstI-XhoI) were
ligated into the plasmid containing P4gp140 with a stop codon. The P4gp140 plasmid was opened
at the XbaI and XhoI sites for the ligation.

Figure 20 illustrates the assembly of P8gp41. Two complementary nucleotide strands
1265gp41S and 1265gp41AS, designed with overhangs creating a 5' XhoI and a 3' PstI
restriction enzyme site, were annealed and ligated into piece 8, which was already opened
at the XhoI and PstI sites, whereby S1265 is deleted.

Figure 21 illustrates the assembly of P7. P8 (XhoI-SacII) and S2060SacII (SacII-ClaI) were
ligated into the P5 plasmid opened at XhoI and ClaI.

Figure 22a is an SDS-PAGE of 35S-labelled HIV-1 BX08 envelope glycoproteins
radio-immunoprecipitated from transiently transfected 293 cells using the indicated plasmids. Cell pellet
(membrane bound antigens) or cell supernatant (secreted antigens) were precipitated by
a polyclonal anti-HIV-1 antibody pool. Lane 1: untransfected cells. Lane 2: supernatant
from syn.gp120MN transfected cells. Lane 3: cell pellet from wt.gp160BX08 transfected cells.
Lane 4: cell pellet from cells co-transfected by wt.gp160BX08 and pRev. Lane 5: Mwt
marker. Lane 6: cell pellet from syn.gp160BX08 transfected 293 cells. Lane 7: cell pellet
from syn.gp150BX08 transfected 293 cells. Lane 8: supernatant from syn.gp140BX08
transfected cells. Lane 9: supernatant from syn.gp120BX08 transfected cells.

Figure 22b is an SDS-PAGE of 35S-labelled HIV-1 BX08 envelope glycoproteins
radio-immunoprecipitated from transiently transfected 293 cells as cell pellet (membrane
bound) or cell supernatant (secreted antigens) by anti-HIV-1 antibody pool using the
indicated plasmids. Lane 1: untransfected 293 cells. Lane 2: cell pellet from syn.gp160MN
transfected 293 cells as positive control (Vinner et al 1999). Lane 3: cell supernatant from
syn.gp120MN transfected 293 cells as positive control (Vinner et al 1999). Lane 4: cell
supernatant from syn.gp120BX08 transfected 293 cells demonstrating a glycoprotein
band of 120 kDa. Lane 5: cell supernatant from syn.gp140BX08 transfected 293 cells
demonstrating a glycoprotein band of 120 kDa. Lane 6: Mwt marker. Lane 7 (at two
different exposure times): cell pellet from syn.gp150BX08 transfected 293 cells
demonstrating a glycoprotein band of 120 kDa (lower gp30 band is not well seen in this
exposure). Lane 8: cell supernatant from syn.gp150BX08 transfected 293 cells showing
no secreted proteins (all protein is membrane bound, see lane 7).

Figure 22c shows fluorescent microscopy of U87.CD4.CCR5 cells transfected with BX08
gp160 genes plus pGFP. Panel A: cells transfected with empty WRG7079 vector plus
pGFP showing no syncytia. Panel B: cells transfected with wild type BX08gp160 gene
plus pGFP showing some syncytia. Panel C: cells transfected with synBX08gp160 plus
pGFP showing an extreme degree of syncytia formation. This demonstrates expression,
functionality, and tropism of the expressed BX08 glycoprotein, with much more
functionally active gp160 expressed from the synthetic BX08 gene.

Figure 23 shows the anti-Env-V3 BX08 antibody titers (IgG1). Panels show individual mice
DNA immunized with syn.gp140BX08 plasmid either i.m. (left panel) or by gene gun (right
panel), respectively. Immunization time points are indicated by arrows.

Figure 24 shows a Western blot of (from left to right) one control strip, followed by sera
(1:50) from 2 mice i.m. immunized with synBX08gp120, 2 mice i.m. immunized with
synBX08gp140, 2 mice i.m. immunized with synBX08gp150, and 2 mice immunized with
synBX08gp160, followed by 2 mice gene gun immunized with synBX08gp120, 2 mice
gene gun immunized with synBX08gp140, 2 mice gene gun immunized with
synBX08gp150, and 2 mice gene gun immunized with synBX08gp160, respectively. Strip 5 is
mouse 5.1, DNA immunized i.m. with synBX08gp140 plasmid (same mouse as in figure
23). Plasma was examined at week 18. The position of gp160 (spiked with four coupled
gp41), gp120 and gp41 is indicated at the right. A positive reaction to HIV glycoproteins
further demonstrates the mouse anti-HIV immunoglobulin reacting to HIV of a strain (IIIB)
different from BX08, illustrating cross-strain reactivity.

Figure 25 Theoretical example of calculation of the 50% inhibitory concentration (IC50)
values. IC50 for each mouse serum is determined by interpolation from the plots of percent
inhibition versus the dilution of serum.

Figure 26 CTL responses were measured at week 18 to the mouse H-2Dd restricted BX08
V3 CTL epitope (IGPGGRAFYTT) for BALB/c mice (H-2Dd) i.m. immunized at week 0, 9,
and 15 with the synthetic vaccine genes syn.gp120BX08, syn.gp140BX08, syn.gp150BX08,
and syn.gp160BX08, respectively, and median values of different E:T ratios for groups of
mice are shown (26A). Intramuscular DNA immunization with syn.gp150BX08 induced a
higher CTL response when injected i.m. in high amounts versus gene gun inoculation of
skin (26B).

Figure 27 Summary of western immunoblotting assay of mice sera (1:40) collected at week
0, 9, and 18 from mice genetically immunized with syn.gp120BX08, syn.gp140BX08,
syn.gp150BX08, syn.gp160BX08, wt.gp160BX08, and wt.gp160BX08 plus pRev, respectively.
Percent responders in groups of 17-25 mice against gp120 and gp41 are shown.

Figure 28 IgG anti-rgp120 (IIIB) antibody titers of individual mice inoculated at week 0, 9, 15
(28A), or gene gun immunized at week 3, 6, 9, and 15 (28B) with the syn.gp150BX08 DNA
vaccine.

Figure 29 IgG antibody titers to HIV-1 rgp120 (IIIB). Median titers are shown from groups of
mice i.m. inoculated at week 0, 9, and 15 (29A), or gene gun immunized at week 0, 3, 6,
9, and 15 (29B) with the synthetic genes syn.gp120BX08, syn.gp140BX08, syn.gp150BX08,
and syn.gp160BX08, respectively.
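
The interpolation mentioned in the legend to Figure 25 can be sketched as follows. This is a hedged illustration, not the procedure used for the figures: it assumes linear interpolation between the two measured points that bracket 50% inhibition, performed on a log10 scale of the reciprocal serum dilution, and the function name and example values are hypothetical.

```python
import math

def ic50_dilution(dilutions, inhibition):
    """Estimate the reciprocal serum dilution giving 50% inhibition.

    dilutions  -- reciprocal dilutions in ascending order (e.g. 10, 100, 1000)
    inhibition -- percent inhibition measured at each dilution
    Returns None when the curve never crosses 50%.
    """
    points = list(zip(dilutions, inhibition))
    for (d0, i0), (d1, i1) in zip(points, points[1:]):
        # The curve crosses 50% between two neighbouring points.
        if (i0 - 50.0) * (i1 - 50.0) <= 0 and i0 != i1:
            fraction = (i0 - 50.0) / (i0 - i1)
            log_d = math.log10(d0) + fraction * (math.log10(d1) - math.log10(d0))
            return 10 ** log_d
    return None

# Hypothetical readings: 80% inhibition at 1:10 and 20% at 1:100.
example = ic50_dilution([10, 100], [80, 20])
```

A serum whose inhibition stays above or below 50% over the whole dilution series yields `None` and would be reported as outside the tested range.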
Examples

Example 1: Designing the nucleotide sequence construct 

Initially the overall layout of the nucleotide sequence construct is decided. The overall layout
comprises the various derivatives the gene will be expressed as. For BX08 these include, but
are not restricted to, gp160, gp150, gp140, gp120, and gp41.

Next, the vehicle of expression (plasmid or virus) is to be determined. The choice of a
suitable vector determines the need for a leader sequence, the terminal restriction enzyme sites,
and whether or not an N- or C-terminal protein tag is to be considered (poly-His,
Myc antibody epitope, etc.). For BX08 a plasmid expression vehicle was chosen. All native wild
type HIV codons are systematically exchanged with the codons most frequently represented
in a pool of highly expressed human genes (figure 1). By this exchange the amino acid
sequence is conserved while the nucleotide sequence is dramatically altered. Thus, gene
structures like overlapping reading frames (e.g. vpu, rev, and tat) or secondary structures
(e.g. RRE) are most likely destroyed whereas protein cleavage sites and glycosylation sites
are maintained. The 100% amino acid identity between wtBX08 and synthetic BX08 in the
present examples should be calculated after the initial Ala-Ser amino acid sequence, as that
sequence is encoded by the 6 nucleotide long NheI restriction enzyme site.
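
The systematic codon exchange can be sketched in Python as below. The preference table in this sketch is an assumption made for illustration (one commonly cited preferred codon per amino acid); the table actually used is the one of figure 1. The function names and the input fragment are likewise hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Assumed preference table for this sketch only; see figure 1 for the real one.
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}

def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

def humanise(dna):
    """Replace every codon by the preferred codon for the same amino acid."""
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(PREFERRED[CODON_TABLE[dna[i:i + 3]]] for i in range(0, len(dna), 3))

# Hypothetical AT-rich input fragment encoding M R V K G I.
wild_type = "ATGAGAGTAAAAGGAATA"
synthetic = humanise(wild_type)
```

Comparing `translate(wild_type)` with `translate(synthetic)` confirms that the amino acid sequence is conserved while the nucleotide sequence changes.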

Depending on the restriction enzyme sites located in the expression vector it is decided
which restriction enzyme sites can be present (tolerated) throughout the finished gene
construct. The terminal restriction enzyme sites of the synthetic gene must remain unique to
enable cloning into the vector chosen for expression. General requirements for restriction
enzyme sites of choice: preferably creating cohesive ends facilitating ligation, creating no
compatible ends with adjacent restriction enzyme sites (e.g. BamHI/BglII), and being efficient
cutters. For BX08 the restriction enzyme sites accepted were the ones present in the
polylinker of the pBluescript cloning vectors (EagI, MM, EcoRV, PstI, ClaI, EcoRI, XbaI,
SacI, SpeI, XhoI, HindIII, SacII, NotI, BamHI, SmaI, SalI, DraI, KpnI with the exception of
BglII and NheI). This was decided to satisfy the original cloning strategy using individual
cloning of snuts in pBluescript with restriction enzyme cleaved (trimmed) ends after PCR
amplification, which is not necessary when blunt-end cloning and assembling of
complementary oligonucleotides are employed. All locations at which the selected restriction
enzyme sites can be introduced by silent mutations (keeping 100% loyal to the amino acid
sequence) are identified using the SILMUT software or equivalent.
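
The search for locations where a restriction enzyme site can be introduced by silent mutation can be illustrated by a brute-force Python sketch. This is not the SILMUT algorithm; it simply tries the recognition sequence at every position and keeps those positions where the translation is unchanged. The function names and the toy sequence are hypothetical, and positions are 0-based.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

def silent_sites(dna, site):
    """Return 0-based positions where `site` can be written into `dna`
    without changing the encoded amino acid sequence."""
    protein = translate(dna)
    hits = []
    for i in range(len(dna) - len(site) + 1):
        if dna[i:i + len(site)] == site:
            continue  # the site is already present here
        candidate = dna[:i] + site + dna[i + len(site):]
        if translate(candidate) == protein:
            hits.append(i)
    return hits

# Hypothetical toy gene encoding A S A S; NheI (GCTAGC) encodes Ala-Ser in frame.
positions = silent_sites("GCCAGCGCTTCC", "GCTAGC")
```

From the list of permitted positions, the sites actually introduced are then selected by hand according to the functional regions they should flank.
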
From these possible restriction enzyme sites, a selection of restriction enzyme sites is
introduced by silent nucleotide substitutions around functional regions of choice of the
corresponding gene (e.g. RRE) or gene products (e.g. variable region 1 (V1), V2, V3, CD4
binding area, transmembrane domain, and regions of immunological significance, etc.).
Restriction enzyme sites are located at terminal positions of subcloned snuts (building
entities) but additional restriction enzyme sites may be present within subunits. For BX08 the
construct was initially to be cloned in the WRG7079 vector containing a tPA-leader
sequence. Cloning sites were 5'-NheI -> BamHI-3'. The entire humanised BX08 sequence
was divided into thirds: 5'-NheI -> XhoI -> SacII -> BamHI-3'. These sites were chosen in
this particular order because it resembles the polylinker of pBluescript (KS ) enabling
successive ligations of the assembled thirds in this cloning vector. Within these thirds
restriction enzyme sites were kept unique. Next, restriction enzyme sites were placed to flank
the functional regions chosen as follows:

A. (5'-V1): EcoRV 235: Between C1 and V1. Alternatives: 3xHindIII (already excluded
because of exclusive use at position 1890) or EcoRV.

B. (V1-3'): Only alternative PstI 375.

C. (5'-V2): as B.

D. (V2-3'): Alternatives: SpeI, ClaI 495. ClaI chosen because it is closer to V2.

E. EcoRI 650 placed because next possible site was too far away.

F. (5'-V3): BglII 720 was the alternative closest to the V3 region and furthermore unique.

G. (V3-3'): Alternatives XhoI (excluded) and XbaI 900 located very close to the V3 loop.

H. (5'-V4): SacI 990: alternatively EcoRI or BamHI (both excluded).

I. (V4-3'): Alternatives SpeI 1110, KpnI 1145, PstI 1135. PstI already used, SpeI chosen
because of distance to previous site (SacI 990).

J. (5'-V5): XhoI initially determined.

K. (Fusion peptide-3'): PstI 1465 was the closest alternative to XhoI 1265.

L. (5'-Immunodominant region): XbaI 1630 chosen among EcoRV (blunt end), PstI and XhoI
(both already used).

M. (Immunodominant region-3'): EagI 1700 perfect location.

N. (C34 and C43-3' (Chan, Fass, et al. 1997), and 5'-transmembrane domain): SacII. No
alternatives.

O. (Transmembrane domain-3'): SacII 2060 already present.

P. ClaI 2190 perfect position in relation to previous RE-site.

Q. PstI perfect position in relation to previous RE-site.

R. EcoRI 2400 introduced to facilitate later substitution of terminal snut.

S. BamHI 2454 determined by the WRG7079 vector.
Remove undesired restriction enzyme sites by nucleotide substitutions (keeping loyal to the
amino acid sequence). Nucleotide substitutions should preferably create codons frequently
used in highly expressed human genes (figure 1). If that is not possible, the codons should
be selected from the regular codons (figure 7). The substitutions made to the second
nucleotide sequence to obtain desired restriction enzyme sites are shown in Table 2.
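The removal step can be illustrated with a short sketch. It is a simplified assumption-laden example rather than the procedure actually used: the function names are hypothetical, only single-base synonymous changes are tried, and no ranking by human codon usage (figure 1) is attempted, so a real implementation would add that preference.

```python
from itertools import product

# Standard genetic code in TCAG order (assumed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame DNA string, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def _break_one_site(cds, pos, site):
    """Try every single-base synonymous change inside the site at `pos`.
    Return the modified sequence, or None if no silent change removes it."""
    before = cds.count(site)
    for i in range(pos, pos + len(site)):
        start = i - i % 3                      # first base of the codon holding i
        codon = cds[start:start + 3]
        if len(codon) < 3:
            continue
        for base in "ACGT":
            new = codon[:i - start] + base + codon[i - start + 1:]
            if new != codon and CODON_TABLE[new] == CODON_TABLE[codon]:
                candidate = cds[:start] + new + cds[start + 3:]
                if candidate.count(site) < before:
                    return candidate
    return None


def remove_site(cds, site):
    """Remove all occurrences of `site` from `cds` by silent substitutions."""
    cds, site = cds.upper(), site.upper()
    while site in cds:
        changed = _break_one_site(cds, cds.find(site), site)
        if changed is None:
            raise ValueError("no silent single-base change removes the site")
        cds = changed
    return cds


# Hypothetical example: remove a PstI site (CTGCAG) from Ala-Leu-Gln-Ala.
cleaned = remove_site("GCCCTGCAGGCC", "CTGCAG")
print(cleaned, translate(cleaned))
```

Each accepted change lowers the number of site occurrences, so the loop terminates; a change that would create a second copy of the same site elsewhere is rejected.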



Table 2 lists silent nucleotide substitutions in the humanised BX08 envelope
sequence. Substitutions were made to create or delete restriction enzyme sites.

Position:  substitution  Remarks:
138        c -> g        creates MluI site on pos. 134-139
240        c -> t        creates EcoRV site on pos. 238-243
501        c -> a        creates ClaI site on pos. 501-506
502        a -> t        do
503        g -> c        do
504        c -> g        do
657        c -> a        creates EcoRI site on pos. 656-661
660        c -> t        do
724        c -> a        creates BglII site on pos. 724-729
726        c -> g        do
727        a -> t        do
728        g -> c        do
729        c -> t        do
840        c -> t        EagI site is eliminated
904        a -> t        creates XbaI site on pos. 904-909
905        g -> c        do
906        c -> t        do
907        c -> a        do
909        c -> a        do
994        a -> t        creates SacI site on pos. 990-995
995        g -> c        do
1116       c -> t        creates SpeI site on pos. 1114-1119
1119       c -> t        do
1273       a -> t        creates XhoI site on pos. 1272-1277
1274       g -> c        do
1275       c -> g        do
1293       c -> t        BglII site is eliminated
1443       c -> t        BstXI site is eliminated
1452       g -> c        do
1467       c -> t        PstI site on pos. 1466-1471
1470       c -> a        do
1590       g -> c        PstI site on pos. 1588-1593 is eliminated
1620       g -> c        PstI site on pos. 1618-1623 is eliminated
1638       c -> t        creates XbaI site on pos. 1638-1643
1641       g -> a        do
1653       g -> c        PstI site is eliminated
1687       a -> t        PstI site is eliminated
1688       g -> c        PstI site is eliminated
1710       c -> g        creates EagI site on pos. 1709-1714
1758       c -> t        BglII site is eliminated
1875       g -> c        PstI site is eliminated
1893       c -> a        HindIII on pos. 1893-1898
1897       c -> t        do
1944       c -> t        BglII site is eliminated



WO 00/29561    PCT/DK00/00144

Position: 2199, 2202, 2203, 2253, 2292, 2320, 2321, 2322, 2325, 2430, 2433

substitution: c -> a, c -> a, c -> t; a; c -> t, c -> t, c -> t; c -> g, g -> c, a -> t

Remarks: creates EcoRI site on pos. 2429-2434, do; do, do, do; SacII site is eliminated; PstI site is eliminated; PstI site on pos. 2321-2326 is eliminated; do, do; ClaI site on pos. 2198-2203



Example 2: synthesis of oligos 

In order to clone the individual snuts, nucleotide strands were synthesised or purchased. In
total 28 synthetic nucleotide strands were synthesised. Nucleotide strands were synthesised
by standard 0.2 µmol β-cyanoethyl-phosphoramidite chemistry on an Applied Biosystems
DNA synthesiser model 392, employing 2000 Å CPG columns (Cruachem, Glasgow,
Scotland), acetonitrile containing less than 0.001% water (Labscan, Dublin, Ireland) and
standard DNA-synthesis chemicals from Cruachem, including phosphoramidites at 0.1 M
and tetrahydrofuran/N-methylimidazole as cap B solution. The nucleotide strands O-N-C
and 119MS-RC (for cloning of snut O-N-Lang), 650-E-BG and 720-XBAC-31 (for cloning of
snut 650-720-EcoRI), 2425esup and 2425ESdo (for cloning of snut 2425-E-S) were
synthesised with 5' end "trityl on" and purified on "Oligonucleotide Purification Cartridges"
(Perkin Elmer, CA, USA) as described by the manufacturer. Other nucleotide strands
(235-EC05, 375-pst1.seq, 495-Cla1.seq, 900-XbaI, 990-sacI, 1110-SPE, 1630-Xba.seq,
1700-Eag.seq, 17-Eag.seq, 1890-Hind.MPD, 2060-sac, 2190-cla, 2330-pst) were synthesised with
5' end "trityl off" and purified by standard ethanol precipitation. Oligos 1265-1UP,
1265-1DO, 1265-2UP, 1265-2DO, 1265-3UP, 1265-3DO, 1465-1UP, 1465-1DO, 1465-2UP and
1465-2DO were purchased from Pharmacia.

Example 3: Cloning of snuts 

The nucleotide sequence construct was designed in 17 small DNA pieces called snuts
(Table 3) encompassing important structures like variable and constant regions, each flanked
with restriction enzyme (RE) sites to facilitate cassette exchange within each third of the
gene: NheI-XhoI, XhoI-SacII, SacII-BamHI.

Each snut was cloned individually in a commercial vector (pBluescriptKS or pMOSblue) and
kept as an individual DNA plasmid, named after the snut, which gives the nucleotide position of
the RE site in BX08.
Table 3 lists the snuts by their name and cloning vector.

Name             Cloning vector:
SO-N-Lang        pMOSblue
S235EcoRV        pMOSblue
S375PstI         pBluescriptSK
S495ClaI         pMOSblue
S650-720EcoRI    pMOSblue
S900XbaI         pMOSblue
S990SacI         pMOSblue
S1110SpeI        pMOSblue
S1265XhoI        pBluescriptSK
S1465PstI        pBluescriptSK
S1630XbaI        pBluescriptSK
S1700EagI        pBluescriptSK
S1890HindIII     pBluescriptSK
S2060SacII       pMOSblue
S2190ClaI        pMOSblue
S2330PstI        pMOSblue
S2425ES          pBluescriptSK

Three principally different methods were used to obtain the dsDNA corresponding to each of
the 17 snuts needed to build the synthetic BX08 genes.

1) "Overlapping" PCR: is based on the use of two ssDNA template nucleotide strands
(forward and reverse) that complement each other at their 3'-ends (figure 8). During the first
PCR cycle, both templates anneal to each other at the 3'-ends, allowing the full-length
polymerisation of each complementary strand during the elongation step. The newly
polymerised dsDNA strands are then amplified during the following cycles using an adequate
forward and reverse primer set (figure 8).
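The first-cycle extension in "overlapping" PCR can be modelled in a few lines. The sketch below is illustrative only: the function names, the `min_overlap` parameter and the toy sequence are assumptions, and the model ignores mismatches and melting temperature.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def overlap_extend(forward, reverse, min_overlap=8):
    """Return the top strand of the full-length duplex obtained when two
    oligonucleotides (both written 5'->3') anneal at their 3'-ends and
    are filled in by the polymerase."""
    forward = forward.upper()
    bottom_as_top = revcomp(reverse)  # reverse oligo expressed on the top strand
    longest = min(len(forward), len(bottom_as_top))
    for k in range(longest, min_overlap - 1, -1):
        if forward[-k:] == bottom_as_top[:k]:
            return forward + bottom_as_top[k:]
    raise ValueError("3'-ends are not complementary over min_overlap bases")


# Hypothetical 32-base target split into two templates sharing 10 bases.
target = "ATGGCCAAGCTTGGATCCGATTTCCCGGGTAA"
fwd = target[:20]
rev = revcomp(target[10:])
print(overlap_extend(fwd, rev) == target)
```

The outer primers used in the later cycles then amplify this full-length product.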

Snut O-N-LANG (SO-N-Lang): Two ng of the forward template nucleotide strand O-N-C and 2
ng of the reverse template nucleotide strand 119MS-RC were mixed together with 50 pmoles
of the forward primer O-N-LANG-5 (5'-CTAGCTAGCGCGGCCGACCGCCT-3') and
50 pmoles of the reverse primer O-N-LANG-3 (5'-CTCGATATCCTCGTGCATCTGCTC-3') in
a 100 µl PCR reaction volume containing 0.2 mM dNTPs, 1x ExpandHF buffer with MgCl2
(1.5 mM) and 2.6 units of enzyme mix (Expand™ High Fidelity PCR System from Boehringer
Mannheim). The PCR was performed with the PE Amp 9600 thermocycler (Perkin Elmer)
using the following cycle conditions: initial denaturation at 94°C for 30 sec, followed by 30
cycles of 94°C for 15 sec, 65°C for 30 sec, 72°C for 45 sec, with a final elongation at 72°C
for 5 min, and cooling to 4°C.

Snut 650-720-EcoRI (S650-720EcoRI): PCR amplification was performed as described for snut
O-N-LANG. One µg of the forward ssDNA template oligonucleotide 650-E-BG and 1 µg of
the reverse ssDNA template oligonucleotide 720-XBAC were mixed with 40 pmoles of the
forward primer 650-E-5 (5'-CCGGAATTCGCCCCGTGGTGAGCA-3') and 40 pmoles of the
reverse primer 720-X-3 (5'-CTGCTCTAGAGATGTTGCAGTGGGCCT-3').

2) "Normal" PCR amplification: Eleven nucleotide strands (235-EC05, 375-pst1, 900-xba1,
990-sacI, 1110-SPE, 1630-XBA, 1700-EAG, 1890-HIN, 2060-sac, 2190-cla and 2330-pst)
were designed with common 5' and 3' flanking sequences, which allowed PCR amplification
with the same primer set: forward primer BX08-5 (5'-AGCGGATAACAATTTCACACAGGA-3')
and reverse primer BX08-3 (5'-CGCCAGGGTTTTCCCAGTCACGAC-3') (Figure 9). The
495-Cla1 oligonucleotide was designed without a common flanking sequence and was
therefore amplified with a specific set of primers, 495-5N/495-3N
(5'-GAATCGATCATCACCCAG-3' and 5'-GACGAATTCCGTGGGTGCACT-3'). Each oligonucleotide was
resuspended in 1 ml of water and kept as a stock solution (approximately 0.2 mM). PCR
amplification was performed with the Expand™ High Fidelity PCR System from Boehringer
Mannheim (Cat. No. 1759078). Four concentrations of template nucleotide strand were
systematically used: undiluted stock solution and stock solution diluted 10^-1, 10^-2 and
10^-3. One to 5 µl of synthetic ssDNA template was amplified using the following
conditions: BX08-5 (0.5 µM), BX08-3 (0.5 µM), 4 dNTPs (0.2 mM), 1x ExpandHF buffer with
MgCl2 (1.5 mM) and 2.6 units of enzyme mix. The PCR was performed using the PE Amp
9600 thermocycler (Perkin Elmer) with the following cycle conditions: initial denaturation at
94°C for 15 sec, followed by 30 cycles of 94°C for 15 sec, 65°C for 30 sec and 72°C for 45
sec, with a final elongation at 72°C for 7 min, and cooling to 4°C.

3) Minigene approach: This method was used to synthesise S1265XhoI, S1465PstI and S2425ES.

Snut 2425-E-S (S2425ES): 100 picomoles of each oligonucleotide, 2425ES-up (35-mer;
5'-AATTCGCCAGGGCTTCGAGCGCGCCCTGCTGTAAG-3') and 2425ES-do (35-mer;
5'-GATCCTTACAGCAGGGCGCGCTCGAAGCCCTGGCG-3'), were mixed together in a 100 µl
final volume of annealing buffer containing 25 mM NaCl, 10 mM Tris and 1 mM EDTA. After
denaturation at 94°C for 15 min, the mixed oligonucleotides were allowed to anneal at 65°C
for 15 min. The annealing temperature was then allowed to decrease slowly from 65°C to
room temperature (22°C) during overnight incubation. The resulting dsDNA
fragments harboured EcoRI and BamHI restriction site overhangs that allowed direct cloning
in the pBluescript KS(+) vector using standard cloning techniques (Maniatis 1996).
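That the two 2425ES oligonucleotides anneal to leave EcoRI- and BamHI-compatible ends can be checked with a short sketch. The sequences are those given above; the function and constant names are hypothetical, and a fixed four-base 5' overhang on each strand is assumed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Oligonucleotide sequences as given in the text (5'->3').
UP_2425ES = "AATTCGCCAGGGCTTCGAGCGCGCCCTGCTGTAAG"
DO_2425ES = "GATCCTTACAGCAGGGCGCGCTCGAAGCCCTGGCG"


def revcomp(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def annealed_overhangs(up, down, overhang=4):
    """Check that two oligos form a duplex with 5' overhangs of the given
    length and return (overhang of up, overhang of down)."""
    up, down = up.upper(), down.upper()
    if up[overhang:] != revcomp(down[overhang:]):
        raise ValueError("duplex cores are not fully complementary")
    return up[:overhang], down[:overhang]


# AATT matches an EcoRI-cut end (G^AATTC), GATC a BamHI-cut end (G^GATCC).
print(annealed_overhangs(UP_2425ES, DO_2425ES))
```

The same check can be applied to each pair of minigene oligonucleotides described below, adjusting the overhang length to the sites involved.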
Snut 1265-XhoI (S1265XhoI): This snut was built according to the strategy depicted in figure 10.
Three minigenes were constructed following the same method described for snut 2425-E-S.
These minigenes are named 1265-1, 1265-2 and 1265-3. The minigene 1265-1 results from
the annealing of the oligonucleotides 1265-1up (68-mer; 5'-TCG AGC AGC GGC AAG GAG ATT
TTC CGC CCC GGC GGC GGC GAC ATGC GCG ACA ACT GGC GCA GCG AGC T-3') and 1265-1do
(68-mer; 5'-GTA CAG CTC GCT GCG CCA GTT GTC GCG CAT GTC GCC GCC GCC GGG GCG G AAA ATC
TCC TTG CCG CTG C-3'). 1265-2 results from the annealing of 1265-2up (61-mer; 5'-GTA CAA
GTA CAA GGT GGT GAA GAT CGA GCC CCT GGG CAT CGC CCC CAC CAA GGC CAA GCG C-3') and
1265-2do (63-mer; 5'-CAC GCG GCG CTT GGC CTT GGT GGG GGC GAT GCC CAG GGG CTC GAT
CTT CAC CAC CTT GTA CTT-3'). Finally, 1265-3 results from the annealing of 1265-3up (69-mer;
5'-CGC GTG GTG CAG CGC GAG AAG CGC GCC GTG GGC ATC GGC GCT ATG TTC CTC GGC TTC CTG
GGC GCT GCA-3') and 1265-3do (59-mer; 5'-GCG CCC AGG AAG CCG AGG AAC ATA GCG CCG
ATG CCC ACG GCG CGC TTC TCG CGC TGC AC-3'). Each minigene was designed to
present single-strand overhangs at its 5'- and 3'-ends that allow easy ligation and direct
XhoI-PstI cloning into the pBluescript KS+ vector.

Snut 1465-PstI (S1465PstI): Two minigenes were constructed following the same method
described for snut 2425-E-S. These minigenes are named 1465-1 and 1465-2. The minigene
1465-1 was obtained after annealing of 1465-1up (90-mer; 5'-GGC AGC ACC ATG GGC GCC
GCC AGC CTG ACC CTG ACC GTG CAG GCC CGC CAG CTG CTG AGC GGC ATC GTG CAG CAG CAG
AAC AAC CTG CTG-3') and 1465-1do (98-mer; 5'-CGC GCA GCA GGT TGT TCT GCT GCT GCA CGA
TGC CGC TCA GCA GCT GGC GGG CCT GCA CGG TCA GGG TCA GGC TGG CGG CGC CCA TGG TGC
TGC CTG CA-3'), whereas minigene 1465-2 results from the annealing of 1465-2up (78-mer;
5'-CGC GCC ATC GAG GCC CAG CAG CAC CTG CTC CAG CTGA CCG TGT GGG GCA TCA AGC AGC TCC
AGG CCC GCG TGC TGG CT-3') and 1465-2do (78-mer; 5'-CTA GAG CCA GCA CGC GGG CCT GGA
GCT GCT TGA TGC CCC ACA CGG TCA GCT GGA GCA GGT GCT GCT GGG CCT CGA TGG-3'). Each
minigene was designed to present single-strand overhangs at its 5'- and 3'-ends
that allow easy ligation and direct PstI-XbaI cloning into the pBluescript KS+ vector using
standard cloning techniques (Maniatis) (see figure 11).

Example 4: assembly of snuts to pieces.

The snut genes were then assembled into pieces (Table 4) so that unique restriction enzyme
sites or mutagenesis can be used within each of these. This strategy requires fewer
assembly steps for optimal use of the cassette system. The following piece clones were made
and kept individually for construction of the synBX08 gp160 gene (Figure 6):
Table 4 lists pieces by their name and their snut composition.

Piece name   snut composition                      vector
P1           SO-N-LANG-S235EcoRV-S375PstI          pBluescriptSK
P2           S900XbaI-S990SacI-S1110SpeI           pMOSblue
P3           P1-S495ClaI-S650-720EcoRI-P2          pBluescriptSK
P4gp160      S1630XbaI-S1700EagI-S1890HindIII      pBluescriptSK
P5           S2190ClaI-S2330PstI-S2425ES           pBluescriptKS
P7           P8gp160-S2060SacII-P5                 pBluescriptKS
P8gp160      S1265XhoI-S1465PstI-P4gp160           pBluescriptKS

Piece 1: The building strategy is shown in figure 12.

Preparation of the insert DNA: Five to 15 ng of each plasmid, O-N-LANG-cl7 and
235-EcoRV-cl5N, were double-digested by XbaI/EcoRV and PstI/EcoRV, respectively, according to
the classical RE digestion procedure (Maniatis). The RE digestion products were agarose gel
purified according to the classical method (Maniatis). All RE digests were loaded on a 3% Nusieve
3:1 (FMC), 0.5x TBE agarose gel and submitted to electrophoresis (7 Volts/mm for
2-3 hours) until optimal fragment separation. The agarose bands containing the DNA fragments
that correspond to the snut sequence sizes (243 bp for O-N-LANG and 143 bp for
235-EcoRV) were excised from the gel. The DNA was extracted from the agarose by centrifugation
for 20 min at 5000 g using a spin-X column (Costar cat#8160).

Preparation of the vector: The snut 375-PstI klon1 was used as plasmid vector. Five µg were
digested with XbaI and PstI. Removal of the polylinker XbaI/PstI fragment was performed by
classical agarose gel purification, using a 0.9% Seakem-GTG agarose, 0.5x TBE gel. The
linearised plasmid DNA was extracted from the agarose by filtration through a spin-X column.
All purified DNA fragments were quantified by spectrophotometry.

Ligation: All three DNA fragments, O-N-LANG (XbaI/EcoRV), 235-EcoRV (PstI/EcoRV) and
375-PstI (XbaI/PstI), were ligated together by the classical ligation procedure, using an
equimolar (vector:insert1:insert2) ratio of 1:1:1. Thus, 200 ng (0.1 pmole) of
XbaI/PstI-linearised 375-PstI-cl1 were mixed with 16 ng of O-N-LANG (XbaI/EcoRV) and 10 ng of
235-EcoRV (PstI/EcoRV) in a final reaction volume of 20 µl of 1x ligation buffer containing
10 U of T4 DNA ligase (Biolabs, cat#202S). The ligation was allowed to proceed overnight at 16°C.

Transformation: Competent XL1-Blue bacteria (Stratagene cat#200130, transformation
efficiency > 5x10^6 col/µg) were transformed by the classical heat-shock procedure: 1/10th of
the pre-chilled ligation reaction was mixed with 50 µl of competent bacteria. The mixture was
left to stand on ice for 30 min. Bacteria were heat-shocked at 42°C for 45 sec and then left
for 2 min on ice before being resuspended in 450 µl of SOC medium. Transformed bacteria
were incubated for 1 hour at 37°C under shaking (250 rpm) and plated on LB-ampicillin agar
plates. The recombinant clones were allowed to grow for 16 hours at 37°C.

Colony screening: 10 to 50 recombinant colonies were screened by direct PCR screening
according to the protocol described in the pMOSBlue blunt-ended cloning kit booklet (RPN
5110, Amersham). Each colony was picked and resuspended in 50 µl of water. DNA was freed by
boiling (100°C, 5 min). Ten µl of bacterial lysate were mixed with 1 µl of a 10 mM solution of
premixed 4 dNTPs, 1 µl of M13reverse primer (5 pmoles/µl, 5'-CAGGAAACAGCTATGAC-3'), 1 µl of
T7 primer (5 pmoles/µl, 5'-TAATACGACTCACTATAGGG-3'), 5 µl of 10x Expand HF buffer 2
(Boehringer Mannheim, cat#1759078) and 0.5 µl of enzyme mix (Boehringer Mannheim, 5 U/µl)
in a final volume of 50 µl. DNA amplification was performed with a PE9600 thermocycler
(Perkin-Elmer) using the following cycling parameters: 94°C, 2 min; 35 cycles (94°C, 30 sec;
50°C, 15 sec; 72°C, 30 sec); 72°C, 5 min; 4°C hold. Five µl of the PCR products were analysed
after electrophoresis on a 0.9% SeakemGTG, 0.5x TBE agarose gel.

Nucleotide sequence confirmation: dsDNA was purified from minicultures of the selected clones
with the JETstar mini plasmid purification system (Genomed Inc.). Sequencing was performed
using M13reverse and T7 primers with the BigDye™ Terminator Cycle Sequencing Ready
Reaction kit (Perkin-Elmer, Norwalk, Connecticut, P/N4303152) and the ABI-377 automated
DNA sequencer (Applied Biosystems, Perkin-Elmer, Norwalk, Connecticut). Data were
processed with the Sequence Navigator and Autoassembler software (Applied Biosystems,
Perkin-Elmer, Norwalk, Connecticut).
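The equimolar amounts used in the piece 1 ligation can be reproduced with a simple mass-to-mole conversion. The sketch below assumes an average mass of 650 g/mol per base pair of dsDNA, a common rule of thumb that is not stated in the text; the function name is hypothetical.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA (assumed average)


def ng_for_pmol(length_bp, pmol):
    """Mass in ng of a dsDNA fragment of `length_bp` base pairs
    corresponding to `pmol` picomoles."""
    return pmol * length_bp * AVG_BP_MASS / 1000.0


# Fragment sizes taken from the text; 0.1 pmol of each is wanted.
for name, size in (("O-N-LANG", 243), ("235-EcoRV", 143)):
    print(name, round(ng_for_pmol(size, 0.1), 1), "ng")
```

Under this assumption 0.1 pmol corresponds to about 15.8 ng of the 243-bp fragment and about 9.3 ng of the 143-bp fragment, in line with the 16 ng and 10 ng used above.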

Piece 2: The strategy for building that piece is depicted in figure 13. RE digestion, DNA
fragment purification, ligation and direct PCR screening of recombinant colonies were
performed according to the procedures described above for piece 1, except for the following:

- The linearised plasmid 1110-SpeI-cl24M1 was used as vector after being digested by
HindIII and SpeI, and agarose gel purified.

- A 166-bp HindIII/SacI fragment, obtained from snut 900-XbaI-cl15, as well as a 130-bp SacI/SpeI
fragment, obtained from snut 990-SacI-cl14, were agarose gel purified.

- Equimolar amounts (0.1 pmoles) of the three DNA fragments described above were ligated
in a one-step ligation.

- 100 µl of competent SCS110 bacteria (Stratagene cat#200247) were transformed with
1/10th of the ligation products according to the manufacturer's instructions.

- Direct colony PCR screening was performed using T7 primer and pMOS-R
(5'-GTTGTAAAACGACGGCCAG-3').
Piece 3: The strategy for building that piece is depicted in figure 14. RE digestion, DNA
fragment purification, ligation and direct PCR screening of recombinant colonies were
performed according to the procedures described above for piece 1, except for the following:

- The plasmid piece1-cl33 was linearised by ClaI and XhoI, in order to be used as vector,
and agarose gel purified.

- A 161-bp ClaI/EcoRI fragment, obtained from 495-ClaI-cl135M1, as well as a 254-bp
EcoRI/XbaI fragment, obtained from 650-720-EcoRI-cl39, and a 374-bp XbaI/XhoI fragment,
obtained from piece2-cl4, were agarose gel purified.

- Equimolar amounts (0.1 pmole) of each of these 4 DNA fragments were mixed and ligated
together.

- 50 µl of competent XL1-Blue bacteria were transformed with 1/10th of the ligation products
according to the protocol described for piece 1.

- Direct colony PCR screening was performed using M13Reverse and T7 primers.

Piece 4 gp160: The strategy for building that piece is depicted in figure 15. RE digestion,
DNA fragment purification, ligation and direct PCR screening of recombinant colonies
were performed according to the procedures described above for piece 1, except for the
following:

- The plasmid 1630-XbaI-cl2 was linearised by SacII/EagI digestion and agarose gel
purified, in order to be used as vector.

- A 190-bp EagI/HindIII fragment, obtained from snut 1700-EagI-cl4, as well as a 177-bp
SacII/HindIII fragment, obtained from snut 1890-HindIII-cl8, were agarose gel purified.

- Equimolar amounts (0.1 pmoles) of the three DNA fragments described above were ligated
in a one-step ligation.

- 50 µl of competent XL1-Blue bacteria were transformed with 1/10th of the ligation products
according to the protocol described for piece 1.

- Direct colony PCR screening was performed using M13Reverse and T7 primers.

Piece 4-gp150: PCR-based site-directed mutagenesis was performed on double-stranded
plasmid DNA from piece4-cl4 according to an adaptation of the ExSite™ PCR-Based
Site-Directed Mutagenesis Kit procedure (Stratagene cat#200502) (Weiner, M.P., Costa, G.L.,
Schoettlin, W., Cline, J., Mathur, E., and Bauer, J.C. (1994) Gene 151:119-123). The
mutations introduced are shown in bold letters in the primer sequences below. PCR
amplification was performed with the Expand™ High Fidelity PCR System (Boehringer
Mannheim, cat#1759078). Briefly, 1.5 ng, 0.5 ng or 0.1 ng of circular dsDNA was mixed with
1.5 pmoles of P4M2S (5'-TCTGGAAGCTCAGGGGGCTGCATCCCTGGC-3') and 1.5
pmoles of P4M2AS (5'-CCCGCCTGCCCGTGTGACGGATCCAGCTCC-3') in a final volume
of 50 µl containing 4 dNTPs (250 µM each), 1x Expand HF buffer 2 (Boehringer Mannheim,
cat#1759078) and 0.75 µl of enzyme mix (Boehringer Mannheim, 5 U/µl). The PCR was
performed with a PE9600 thermocycler (Perkin-Elmer Corporation) under the following
cycling parameters: 94°C, 2 min; 15 cycles (94°C, 45 sec; 68°C, 4 min); 72°C, 7 min and
4°C hold. PCR products were phenol:chloroform extracted and precipitated (Maniatis).
Plasmid template was removed from the PCR products by DpnI treatment (Biolabs) (Nelson, M.,
and McClelland, M., 1992) followed by ethanol precipitation. Amplicons were resuspended in
50 µl sterile water and phosphorylated according to the following procedure: 7.5 µl of amplicons
were mixed with 0.5 µl of 100 mM DTT, 1 µl of 10x pk buffer and 1 µl of pk enzyme mix
(pMOSBlue blunt-ended cloning kit, Amersham, cat#RPN 5110). DNA kinasing was allowed to
proceed for 5 min at 22°C. After heat-inactivation (10 min, 75°C) of the pk enzyme, 1 µl of
ligase (4 units, Amersham, cat#RPN 5110) was added directly to the pk reaction. The ligation
was allowed to proceed overnight at 22°C. 50 µl of competent XL1-Blue bacteria were
transformed with 1/10th of the ligation reaction according to the classical protocol
(Maniatis). Insertion of mutations was checked by sequencing.

Piece 4-gp140: PCR-based site-directed mutagenesis was performed on piece4-cl4,
according to the procedure described for piece4-gp150 except that the primers P4M1AS
(5'-TGTGTGACTGATTGAGGATCCCCAACTGGC-3') and P4S
(5'-AGCTTGCCCACTTGTCCAGCTGGAGCAGGT-3') were used.

Snut 1265-XhoI-gp120: PCR-based site-directed mutagenesis was performed on plasmid
1265-XhoI-cl2M1 according to the procedure described for piece4-gp150 except that the
primers 1265MAS (5'-CTTCTCGCGCTGCACCACGCGGCGCTTGGC-3') and 1265M2S
(5'-CGCGCCTAGGGCATCGGCGCTATGTTCCTC-3') were used.

Snut 1265-XhoI-gp160/uncleaved: PCR-based site-directed mutagenesis was performed
on plasmid 1265-XhoI-cl2M1 according to the procedure described for piece4-gp150 except
that the primers 1265MAS (5'-CTTCTCGCGCTGCACCACGCGGCGCTTGGC-3') and
1265M2S (5'-AGCGCCGTGGGCATCGGCGCTATGTTCCTC-3') were used.

Snut 1465-PstI-CCG: PCR-based site-directed mutagenesis was performed on plasmid
1465-PstI-cl25 according to the procedure described for piece4-gp150 except that the
primers 1465MAS (5'-CTGCTTGATGCCCCACACGGTCAGCTG-3') and 1465MS
(5'-TGCTGCGGCCGCGTGCTGGCTCTAGA-3') were used.
Piece 5: The strategy for building that piece is depicted in figure 16. RE digestion, DNA
fragment purification, ligation and direct PCR screening of recombinant colonies were
performed according to the procedures described above for piece 1, except for the following:

- The plasmid 2425-ES-cl2 was linearised by ClaI/EcoRI digestion and agarose gel purified,
in order to be used as vector.

- A 129-bp PstI/ClaI fragment, obtained from 2190-ClaI-cl6M15, as well as a 114-bp PstI/EcoRI
fragment, obtained from 2330-PstI-cl8, were agarose gel purified.

- Equimolar amounts (0.1 pmoles) of the three DNA fragments described above were ligated
in a one-step ligation.

- 50 µl of competent XL1-Blue bacteria were transformed with 1/10th of the ligation products
according to the protocol described for piece 1.

- Direct colony PCR screening was performed using T3 (5'-ATTAACCCTCACTAAAG-3') and
T7 primers.

Piece 8: The strategy for building that piece is depicted in figure 17. RE digestion, DNA
fragment purification, ligation and direct PCR screening of recombinant colonies were
performed according to the procedures described above for piece 1, except for the following:

- The plasmid piece4-cl4 was linearised by XbaI/XhoI and agarose gel purified, in order to be
used as vector.

- A 200-bp XhoI/PstI fragment, obtained from 1265-XhoI-cl2M1, as well as a 178-bp PstI/XbaI
fragment, obtained from 1465-PstI-cl25, were agarose gel purified.

- Equimolar amounts (0.1 pmole) of these 3 DNA fragments were mixed and ligated together.

- 50 µl of competent XL1-Blue bacteria were transformed with 1/10th of the ligation products
according to the protocol described for piece 1.

- Direct colony PCR screening was performed using T3 and T7 primers.

Piece 8-gp150: The strategy for building that piece was identical to that of piece 8, except
that piece4-cl4M3 was used as vector (figure 18).

Piece 8-gp150/uncleaved: The strategy for building that piece is identical to that of piece 8,
except that piece4 gp160-cl4M3 is used as vector and a 200-bp XhoI/PstI fragment, obtained
from snut 1265-XhoI-gp160/uncleaved, as well as a 178-bp PstI/XbaI fragment, obtained from
snut 1465-PstI-CCG, are used as inserts.

Piece 8-gp140: The strategy for building that piece was identical to that of piece 8, except
that piece4-cl4M5 was used as vector (figure 19).

Piece 8-gp140/uncleaved: The strategy for building that piece is identical to that of piece 8,
except that piece4-cl4M5 is used as vector and a 200-bp XhoI/PstI fragment, obtained from
snut 1265-XhoI-gp160/uncleaved, as well as a 178-bp PstI/XbaI fragment, obtained from snut
1465-PstI-CCG, are used as inserts.

Piece 8-gp41: The strategy for building that piece is depicted in figure 20. RE digestion,
DNA fragment purification, ligation and direct PCR screening of recombinant colonies
were performed according to the procedures described for the synBX08 gp160 gene. A 63-bp
linker is to be made according to the method described for snut 2425-ES, the minigene
approach. Thus, 2 complementary oligonucleotides, 1265-gp41S
(5'-TCGAGgctagcGCCGTGGGCATCGGCGCTATGTTCCTCGGCTTCCTGGGCGctgca-3') and
1265-gp41AS (5'-gCGCCCAGGAAGCCGAGGAACATAGCGCCGATGCCCACGGCgctagcC-3'),
should be annealed together. This synthetic
linker will be ligated directly into the XhoI/PstI sites of piece8-klon13, from which the snut
1265-XhoI-cl2M1 will have been removed.

Piece 7: The strategy for building that piece is depicted in figure 21. RE digestion, DNA
fragment purification, ligation and direct PCR screening of recombinant colonies were
performed according to the procedures described above for piece 1, except for the following:

- The plasmid piece5-cl1 was linearised by ClaI/XhoI and agarose gel purified, in order to be
used as vector.

- A 798-bp XhoI/SacII fragment, obtained from piece8-cl13, as well as a 140-bp SacII/ClaI
fragment, obtained from 2060-SacII-cl21, were agarose gel purified.

- The ligation of the 3 fragments was performed using a vector:insert ratio of 1:1, 1:2 or 1:5.

- 50 µl of competent XL1-Blue bacteria were transformed with 1/10th of the ligation products
according to the protocol described for piece 1.

- Direct colony PCR screening was performed using M13Reverse and T7 primers.

Example 5: assembly of genes 

synBX08 gp160 gene: The strategy for building that gene is depicted in figure 6. RE
digestion, DNA fragment purification, ligation and direct PCR screening of
recombinant colonies were performed according to the procedures described for piece 1.
20 ng of the expression plasmid WRG7079 were digested by NheI/BamHI. Plasmid DNA
ends were dephosphorylated by Calf Intestinal Phosphatase treatment (CIP, Biolabs)
(Maniatis) to avoid autoligation of any partially digested vector. The CIP enzyme was
heat-inactivated and removed by classical phenol-chloroform extraction. A 1277-bp NheI/XhoI
fragment, obtained from piece3-cl27, as well as a 1194-bp XhoI/BamHI fragment, obtained
from piece7-cl1, were agarose gel purified. The ligation was performed using a vector:insert
ratio of 1:1 or 1:2. Fifty µl of competent XL1-Blue bacteria were transformed with 1/10th of the
ligation product according to the protocol described for piece 1. After transformation, bacteria
were plated on LB-kanamycin agar plates. Direct PCR colony screening was performed
using the primer set WRG-F (5'-AGACATAATAGCTGACAGAC-3') and WRG-R
(5'-GATTGTATTTCTGTCCCTCAC-3'). The nucleotide sequence was determined according to the
methods described above for piece 1.

synBX08 gp150 gene: The strategy for building that gene is depicted in figure 5. RE
digestion, DNA fragment purification, ligation and direct PCR screening of
recombinant colonies were performed according to the procedures described for the
synBX08 gp160 gene. A 1277-bp NheI/XhoI fragment, obtained from piece3-cl27, as well as
an 800-bp XhoI/BamHI fragment, obtained from piece 8-gp150-cl26, were agarose gel purified
and then ligated into the NheI/BamHI sites of WRG7079. The ligation was performed using a
vector:insert ratio of 1:1 or 1:2.

For construction of the synthetic BX08 gp150, piece4 was mutated to piece4gp150, whereby
a tyrosine -> cysteine change was made and a stop codon was introduced after the
transmembrane spanning domain (TMD), followed by a BamHI cloning site. A new
piece8gp150 was constructed, composed of snut1265/snut1465/piece4gp150.

synBX08 gp150/uncleaved gene: RE digestion, DNA fragment purification, ligation and
direct PCR screening of recombinant colonies are performed according to the
procedures described for the synBX08 gp160 gene. A 1277-bp NheI/XhoI fragment, obtained
from piece3-cl27, as well as an 800-bp XhoI/BamHI fragment, obtained from piece
8-gp150/uncleaved, are agarose gel purified and then ligated into the NheI/BamHI
sites of WRG7079. The ligation is performed using a vector:insert ratio of 1:1 or 1:2.

synBX08 gp140 gene: The strategy for building that gene is depicted in figure 4. RE
digestion, DNA fragment purification, ligation and direct PCR screening of
recombinant colonies were performed according to the procedures described for the
synBX08 gp160 gene. A 1277-bp NheI/XhoI fragment, obtained from piece3-cl27, as well as
a 647-bp XhoI/BamHI fragment, obtained from piece 8-gp140-cl2, were agarose gel purified
and then ligated into the NheI/BamHI sites of WRG7079. The ligation was performed using a
vector:insert ratio of 1:1 or 1:2. For construction of the synthetic BX08 gp140, piece4 was
mutated to piece4gp140, whereby a stop codon was introduced just before the TMD, followed
by a BamHI cloning site. A new piece8gp140 was constructed, composed of
snut1265/1465/piece4gp140.

synBX08 gp140/uncleaved gene: RE digestion, DNA fragment purification, ligation, as
well as direct PCR screening of recombinant colonies are performed according to the same
procedures described for the synBX08 gp160 gene. A 1277-bp NheI/XhoI fragment, obtained
from piece3-cl27, as well as an 800-bp XhoI/BamHI fragment, obtained from
piece8-gp140/uncleaved, are agarose gel purified and then ligated into the NheI/BamHI
sites of WRG7079. The ligation was performed using a vector:insert ratio of 1:1 or 1:2.

synBX08 gp120 gene: The strategy for building that piece is depicted in figure 3. RE
digestion, DNA fragment purification, ligation, as well as direct PCR screening of
recombinant colonies were performed according to the same procedures described for the
synBX08 gp160 gene. A 1277-bp NheI/XhoI fragment, obtained from piece3-cl27, as well as
a 206-bp XhoI/BamHI fragment, obtained from 1265-XhoI-gp120-clM5, were agarose gel
purified and then ligated into the NheI/BamHI sites of WRG7079. The ligation was performed
using a vector:insert ratio of 1:1 or 1:2. For construction of the synthetic BX08 gp120, snut
1265 was mutated to S1265gp120 to introduce a stop codon at the gp120/gp41 cleavage site,
followed by a BamHI cloning site.

The gp160, gp150, gp140, and gp120 genes are cloned (NheI-BamHI) and maintained in a
eukaryotic expression vector containing a CMV promoter and a tPA leader, but other
expression vectors may be chosen based on other criteria, e.g. antibiotic resistance selection,
other leader sequences like CDS etc., presence or absence of immune stimulatory sequences, etc.

synBX08 gp41 gene: The strategy for building that gene is depicted in figure 20. RE
digestion, DNA fragment purification, ligation, as well as direct PCR screening of
recombinant colonies were performed according to the same procedures described for the
synBX08 gp160 gene. Piece 8-gp41 is ligated with snut 2060-SacII-klon21 and piece 5 as
already described for the construction of piece 7, creating piece 7-gp41 (P7gp41). Subsequently,
the piece 7-gp41 containing the entire gp41 gene will be cloned in WRG7079 using the NheI
and BamHI sites.

Example 6a: High expression by codon optimization

To analyse the expression of glycoproteins from the wild type and synthetic BX08 envelope
genes, RIPA was performed on transfected mammalian cell lines. Both cell membrane
associated and secreted HIV-1 glycoproteins from the cell supernatants were assayed. The
envelope plasmids were transfected into the human embryonic kidney cell line 293 (ATCC,
Rockville, MD) or the mouse P815 (H-2Dd) cell line using calcium phosphate (CellPhect
Transfection kit, Pharmacia). For radio immune precipitation assay (RIPA), transfected cells
were incubated overnight, washed twice and incubated for 1 hour with DMEM lacking
cysteine and methionine (Gibco). Then the medium was replaced with medium containing 50
µCi/ml of [35S]cysteine and 50 µCi/ml of [35S]methionine (Amersham Int., Amersham,
UK) and incubation continued overnight. Cells were centrifuged, washed twice with HBSS
and lysed in 1 ml ice-cold RIPA buffer (10 mM Tris, pH 7.4, 150 mM NaCl, 50 mM EDTA, 1%
Nonidet P-40, 0.5% sodium deoxycholate) to detect membrane bound Env glycoproteins. The
cell lysates were centrifuged for 15 min. at 100,000 x g to remove any undissolved particles
and 100 µl were immune precipitated with protein A-sepharose coupled human polyclonal IgG
anti-HIV antibodies (Nielsen et al., 1987). For analysis of secreted Env glycoproteins, 500 µl
of the 5 ml supernatants from transfected cells were incubated with protein A-sepharose
coupled human polyclonal IgG anti-HIV antibodies. After washing three times in cold RIPA
buffer and once in PBS, the immunoprecipitates were boiled for 4 min. in 0.05 M Tris-HCl,
pH 6.8, 2% SDS, 10% 2-mercaptoethanol, 10% sucrose, 0.01% bromophenol blue and
subjected to SDS-PAGE (Vinner et al., 1999). Electrophoresis was carried out at 80 mV for 1
hour in the stacking gel containing 10% acrylamide, and at 30 mV for 18 hours in the
separating gradient gel containing 5-15% acrylamide. Gels were fixed in 30% ethanol-10%
acetic acid for 1 hour, soaked for 30 min. in En3Hance (Dupont #NEF 981), washed 2 x 15
min. in distilled water and dried, and autoradiography was performed on Kodak XAR-5 film.
Transfection of human 293 cells with the syn.gp120BX08 and syn.gp140BX08 genes,
respectively, resulted in high amounts of only secreted HIV-1 glycoproteins (Fig. 22a, lanes 9
and 8). Thus, the synthetic gene in the absence of rev expresses the HIV-1 surface
glycoprotein of the expected size, which is recognised by human anti-HIV-1 antisera. The
expression of BX08 gp120 was Rev independent and of roughly the same high amount as that of
gp120 from the syn.gp120MN gene (Fig. 22a, lane 2). Fig. 22a, lanes 6 and 7 show the
expression of only membrane bound gp160 and gp150 from 293 cells transfected with
syn.gp160BX08 and syn.gp150BX08 plasmids, respectively. Transfection with the wt.gp160BX08
plasmid also resulted in a significant expression of membrane bound gp160 despite the absence
of Rev (Fig. 22a, lane 3). Co-transfection with equimolar amounts of Rev encoding plasmid
seemed to increase this expression somewhat (Fig. 22a, lane 4). This is seen despite the
lower transfection efficiency using two plasmids and the use of only half the amount of
wt.gp160BX08 DNA when combined with pRev. The amounts of secreted HIV-1 glycoproteins
from gp120 and gp140 accumulating in the cell supernatants seemed higher than the
amounts of cell associated glycoproteins at the time of harvesting of the cells. Interestingly,
the amounts of gp160 produced from the "humanized" gene were about equal to the
amounts produced by the wt.gp160BX08 + pRev genes (Fig. 22a, lanes 4 and 6).
The processing of gp160, gp150 and gp140 into gp120 plus gp41, or fractions of gp41,
produced from wild type or synthetic genes in the 293 cell line did not function well under
these experimental conditions. The same phenomenon was seen in RIPA from 293-CD4 cells
and HeLa-CD4 cells infected by HIV-1 MN (Vinner et al., 1999). Because of the absence of
CCR5 these cell lines could, however, not be infected by HIV-1 strain BX08.

Example 6b: Radio immuno precipitation assay (RIPA) of synthetic BX08 transfected 
cells showing expression of glycoproteins from synthetic BX08 env plasmids 

The synthetic envelope plasmid DNAs were transfected into the human embryonic kidney cell
line 293 (ATCC, Rockville, MD) using calcium phosphate (CellPhect Transfection kit,
Pharmacia). For immune precipitation analysis, transfected 293 cells were treated and
analysed according to the method described in example 6a.
To analyse expression from these genes, an SDS-PAGE of the 35S-labelled HIV-1
envelopes, immune precipitated from the transfected cells, is shown in figure 22b. Both cell
membrane associated and secreted HIV-1 envelope glycoproteins in the cell supernatants
were assayed. Transfection of 293 cells with the synthetic BX08 gene encoding gp120
(syn.gp120BX08, lane 4) and with syn.gp140BX08 (lane 5), neither of which contained rev
encoding regions, resulted in abundant amounts of HIV-1 gp120. Thus, the expression was Rev
independent and occurred in roughly the same high amounts as from the syn.gp120MN and
syn.gp160MN genes (lanes 3 and 2, respectively), already shown by our group and others to
be markedly increased in comparison with HIV MN wild type genes including rev (Vinner et
al., 1999).

Transfection with syn.gp150 plasmid (lanes 7 and 8) resulted in significant expression of
membrane associated gp120 and low detectable amounts of a truncated form of gp41 (cell
pellet in lane 7), with no detectable HIV-1 glycoprotein in the cell supernatant (lane 8). It is
concluded that the synthetic BX08 genes express the envelope glycoproteins of expected size,
which are recognised by human anti-HIV-1 antiserum.

Example 6c: FACS

To quantitate the surface expression of HIV glycoproteins from the wild type and synthetic
BX08 envelope genes, transfection experiments were done and cell surface expression was
examined by FACS (flow cytometry).

10 µg of the BX08 envelope plasmid (wild type BX08gp160 or synBX08gp160) plus 10 µg of
an irrelevant carrier plasmid, pBluescript, were used to transfect an 80-90% confluent layer of
293 cells in tissue culture wells (25 cm2) using the CellPhect kit (Pharmacia). After 48 hours,
cells were Versene treated, washed and incubated with a mouse monoclonal IgG antibody
to HIV gp120 (NEA-9301, NEN™, Life Science Products Inc., Boston) for 30 min. on wet
ice, followed by washing in PBS, 3% FCS and incubation with phycoerythrin (PE) labelled rat
anti-mouse IgG1 (Cat #346270, Becton Dickinson) according to the manufacturer. After
washing, the cells were fixed in PBS, 1% paraformaldehyde, 3% FCS, and analysed on a
FACS (FACScan, Becton Dickinson). Table 5 shows, in duplicate, expression of BX08 gp160
from 11% of the cells transfected with wild type BX08 (numbers 3 and 4) compared to 48%
of cells expressing BX08 glycoprotein when transfected with the synthetic gene (numbers 1
and 2). Thus, a several fold higher expression is obtained using the synthetic BX08 gene.

Table 5. FACS analysis of 293 cells transfected with syn.gp160BX08 (No 1 and 2) and
wt.gp160BX08 (No 3 and 4) and stained with monoclonal antibodies to surface expressed
HIV glycoproteins. A higher expression was obtained with the synthetic gene (mean 48%) as
compared to the wild type gene (mean 11%).

No  50 µl            45 µl            A     B     C      C-A    C-B
1   syn.gp160BX08 +  pBluescript SK+  2,57  2,85  36,91  34,34  34,06
2   syn.gp160BX08 +  pBluescript SK+  2,83  2,14  58,42  55,59  56,28
3   wt.gp160BX08 +   pBluescript SK+  1,95  1,52   7,51   5,56   5,99
4   wt.gp160BX08 +   pBluescript SK+  2,97  1,42  14,41  11,44  12,99

A: No primary antibody added (control for unspecific secondary Ab binding).
B: Neither primary Ab nor secondary Ab added (autofluorescence control).
C: Primary Ab and secondary Ab added.
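
The background-corrected columns and the group means quoted for Table 5 can be recomputed from columns A, B and C. The following is a minimal sketch of that arithmetic; the function names are illustrative, decimal commas are written as points, and the column B value for sample 3 is taken as 1.52, which is the value consistent with the tabulated C-B of 5.99.

```python
# Percent positive cells from Table 5: (plasmid, A, B, C).
ROWS = [
    ("syn.gp160BX08", 2.57, 2.85, 36.91),
    ("syn.gp160BX08", 2.83, 2.14, 58.42),
    ("wt.gp160BX08", 1.95, 1.52, 7.51),
    ("wt.gp160BX08", 2.97, 1.42, 14.41),
]


def corrected(rows):
    """Return (plasmid, C-A, C-B) for each row, rounded to two decimals."""
    return [(p, round(c - a, 2), round(c - b, 2)) for p, a, b, c in rows]


def mean_stained(rows, plasmid):
    """Mean of column C (fully stained sample) for one plasmid."""
    values = [c for p, _, _, c in rows if p == plasmid]
    return sum(values) / len(values)


for plasmid, c_minus_a, c_minus_b in corrected(ROWS):
    print(plasmid, c_minus_a, c_minus_b)

syn_mean = mean_stained(ROWS, "syn.gp160BX08")
wt_mean = mean_stained(ROWS, "wt.gp160BX08")
print(f"syn mean {syn_mean:.1f}%, wt mean {wt_mean:.1f}%, ratio {syn_mean / wt_mean:.1f}")
```

The means of column C round to the 48% and 11% quoted in the table caption, a ratio of roughly four between the synthetic and the wild type gene.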



Example 6d: Analyses of the surface expression and biological functionality

To analyse the surface expression and biological functionality of the wild type and
synthetic BX08 envelope genes, transfection experiments were done and cell fusion was
studied microscopically using cells expressing the HIV envelope receptors.
10 µg of the BX08 envelope plasmid (wt.BX08gp160 or syn.BX08gp160 or empty WRG7079
vector plasmid) plus 5 µg of a plasmid (pEGFP, Clontech) expressing green fluorescent
protein (GFP) were transfected into 2 x 10^6 adherent U87.CD4.CCR5 cells (NIH AIDS Res. &
Reference program, catalog #4035) stably expressing CD4 and CCR5, using the CellPhect
transfection kit (Pharmacia). After 48 hours the cells were examined by microscopy and
photographed (Fig. 22c).

Fig. 22c panel A shows the negative control (empty WRG7079 plus pGFP) giving no syncytia.
Panel B shows cells transfected with the wild type BX08 gp160 plasmid, where cell-to-cell
fusion (syncytia) is seen. Panel C shows cells transfected with the same amount of
synBX08gp160 plasmid, demonstrating a much higher degree of cell-cell fusion. In fact,
most or all of the cells in the culture plate were fused at this time. This experiment shows
surface expression of functional HIV gp160 with tropism for the CCR5 receptor, as well as a
much higher expression and biological activity from the synthetic BX08 gene as compared to
the wild type equivalent.

Example 7: Gene inoculation of mice for immunization 

6-7 weeks old female BALB/c mice were purchased from Bomholdtgaard, Denmark.
Microbiological status was conventional and the mice were maintained in groups of 4 or 5 per
cage with food and water ad libitum and artificial light 12 hours per day. Acclimatization
period was 2 days. Mice were anaesthetized with 0.2 ml i.p. of rohypnol:stesolid (1:3, v/v)
and DNA inoculated either by i.m. injection of 50 µl of 2 mg/ml plasmid DNA in each tibialis
anterior muscle at weeks 0, 9, and 15, with termination at week 18; or by gene gun inoculation
on shaved abdominal skin using plasmid coated gold particles (0.95 µm particles, 2 µg DNA/mg
gold, 0.5 mg gold/shot, 50-71% coating efficiency) with the hand held Helios® gene gun
device (BioRad) employing compressed helium (400 psi) as the particle motive force. Mice
were gene gun vaccinated at weeks 0, 3, 6, 9, and 15, and terminated at week 18.
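
The DNA amounts delivered by the two routes can be compared with a short calculation. The sketch below is illustrative only: it assumes that "each tibialis anterior muscle" means two injection sites per mouse, and it reads the gold loading as 2 µg DNA per mg gold (the unit prefix is an assumption based on the usual loading range for this device). The function names are hypothetical.

```python
def im_dose_ug(volume_ul, conc_mg_per_ml, sites=2):
    """DNA per i.m. inoculation in µg (1 mg/ml equals 1 µg/µl)."""
    return volume_ul * conc_mg_per_ml * sites


def gene_gun_dose_ug(dna_ug_per_mg_gold, gold_mg_per_shot, coating_efficiency=1.0):
    """DNA carried by one gene gun shot in µg."""
    return dna_ug_per_mg_gold * gold_mg_per_shot * coating_efficiency


im = im_dose_ug(50, 2)  # 50 µl of 2 mg/ml in each of two muscles
gun_low = gene_gun_dose_ug(2, 0.5, 0.50)   # lower bound of coating efficiency
gun_high = gene_gun_dose_ug(2, 0.5, 0.71)  # upper bound of coating efficiency
print(f"i.m.: {im} µg per inoculation")
print(f"gene gun: {gun_low:.2f}-{gun_high:.2f} µg per shot")
```

Under these assumptions an i.m. inoculation delivers a few hundred times more DNA than a single gene gun shot, which is relevant when the two routes are compared in the later examples.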

Example 8: Serological assays 

Western blotting. The induction of a humoral response to gp120 and gp41 antigens by in
vivo expression of the encoded glycoproteins from the synthetic BX08 genes was examined
by western immunoblotting (Figure 27). Mouse antisera (1:40) were evaluated in western
blotting using the commercial HIV BLOT 2.2 strips (Genelabs Diagnostic). The conjugate
was a 1:200 dilution of alkaline phosphatase-conjugated rabbit anti-mouse IgG
(Dakopatts, Glostrup, Denmark). Buffers, incubation conditions and colour development were
used according to the manufacturer. In these western blotting strips the gp160 band from
HIV-1 IIIB contains an oligomeric form of gp41 in a higher concentration than the
monomeric gp41 band on the strip (Genelabs Diagnostic). HIV-1 IIIB lysate is used in these
commercial strips, where the gp160 band is composed by addition of tetrameric gp41. All
preimmune sera tested negative in western blotting. Mice inoculated with syn.gp120BX08
showed antibody response to the heterologous gp120 of HIV-1 IIIB. Inclusion of the
extracellular part of gp41 in the gene syn.gp140BX08 induced antibody reaction to both gp120
and gp41 in all mice. This confirms the in vivo expression of BX08 gp120 and the
extracellular part of gp41. DNA vaccination with syn.gp160BX08 encoding the membrane
bound glycoprotein induced antibodies to gp120 and gp41 in 50% and 64% of the mice,
respectively. DNA vaccination with syn.gp150BX08 induced detectable antibodies to gp120
and gp41 in 41% and 53% of the mice, respectively. Induction of different levels of antibodies
could explain the difference in numbers of positive reactive mice sera in this qualitative
western blotting.

ELISA. Mouse anti HIV-1 gp120 antibodies were measured by indirect ELISA. Briefly, wells
of polystyrene plates (Maxisorb, Nunc) were coated for 2 days at room temperature with
HIV-1 IIIB recombinant gp120 (Intracel) at 0.2 µg/100 µl of carbonate buffer, pH 9.6. Before use
the plates were blocked for 1 hour at room temperature with 150 µl/well of washing buffer (PBS,
0.5 M NaCl, 1% Triton-X-100) plus 2% BSA and 2% skim milk powder. After 3 x 1 min.
washings, mouse plasma was added at 100 µl/well diluted in blocking buffer and the ELISA
plates were incubated for 90 min. at room temperature using a microtiter plate shaker. As standard
curve we used a mouse monoclonal antibody to a conserved part of gp120 between V5-C5
(MRDNWRSELYKY) (#NEA-9301, NEN™ Life Science Products, Inc., Boston, MA). As
calibration control included on each plate we used a plasma pool from 10 mice vaccinated
with BX08 gp120. Plates were again washed 5 x 1 min. and incubated for 1 hour at room
temperature with 100 µl/well of HRP-conjugated rabbit anti-mouse IgG (#P260, Dakopatts,
Glostrup, Denmark) diluted 1:1000 in blocking buffer. Colour was developed with 100 µl/well
of peroxidase enzyme substrate consisting of 4 mg of o-phenylenediamine in 11 ml water
plus 4 µl hydrogen peroxide (30%, w/w). The enzyme reaction was terminated after 30 min.
by 150 µl/well of 1 M H2SO4. The optical density (OD) of wells was measured at 492 nm using
a microplate photometer (Molecular Devices, Biotech-Line, Denmark). Anti-HIV-gp120 IgG
titers were expressed as the reciprocal plasma dilution resulting in an OD492nm value of 0.500.
Mouse anti-HIV-1 BX08 antibodies were also measured by indirect peptide ELISAs as
described above using a BX08 V3 peptide (SIHIGPGRAFYTTGD) (Schafer, Copenhagen,
Denmark).

The IgG antibody response to HIV-1 IIIB rgp120 quantitated by ELISA is seen in Fig. 28
and Fig. 29. No background activity was observed in preimmune sera or in sera from 4 mice
immunized with empty WRG7079 vector in parallel with the BX08 genes. All mice inoculated
with the synthetic BX08 genes either by gene gun or by i.m. injection responded and showed
a persistent and high titered (about 100-10,000) IgG response to rgp120 as exemplified in
Fig. 4. When comparing the median titers for groups of mice (Fig. 29), a moderate antibody
response was observed with the wt.gp160BX08. Intramuscular and gene gun immunization
with a mixture of wt.gp160BX08 plasmid plus Rev encoding plasmid did not increase this
antibody response. This was found even when both plasmids were coated onto the same
gold particles to ensure co-transfection of single target cells. However, to ensure inoculation
of equal amounts of total DNA, only half of the amount of wt.BX08 plasmid was used when
mixing with pRev, which may have contributed to the lower antibody response when pRev
was included. A 5-fold improvement of the antibody response was obtained using the
syn.gp160BX08 gene. This antibody response seemed further improved using the
syn.gp150BX08 gene, where the cytoplasmic internalization signals were eliminated, but only
using gene gun inoculation. For both the gene gun inoculation of skin and i.m. injection, the
highest antibody titers to rgp120 were induced by genes encoding secreted gp120/gp140
glycoproteins versus membrane bound gp150/gp160 glycoproteins. In general,
equal antibody ELISA titers to rgp120 were obtained using gene gun and i.m. injection of
the BX08 vaccine genes.

Example 9: Neutralization assay 

Mouse plasma was diluted in culture medium (RPMI-1640 medium (Gibco) supplemented
with antibiotics (Gibco), Nystatin (Gibco) and 10% FCS (Bodinco)) and heat inactivated at
60°C for 30 min. Of the HIV-1 strain BX08 (50 TCID50 per ml, propagated in PBMC), 250 µl
was incubated for 1 hour at room temperature with 250 µl dilution of mouse serum (four
five-fold dilutions of mouse serum, final dilutions 1:20 to 1:2500). After incubation, 1 x 10^6 PBMC
in 500 µl culture medium were added to the virus-serum mixture and incubated overnight at
37°C in 5% CO2. Subsequently, eight replicates of 10^5 PBMC in 200 µl culture medium were
cultured in 96-well culture plates (Nunc) at 37°C in 5% CO2. After seven days in culture the
concentration of HIV antigen in the culture supernatant was quantitated using HIV antigen
detection ELISA (Nielsen et al., 1987).

This ELISA is performed using human IgG, purified from high titered patient sera, both as
capture antibody and, biotin-linked, as detecting antibody. In brief, anti-HIV capture IgG diluted
1:4000 in PBS, 100 µl/well, is coated onto Immunoplates (Nunc) overnight at 4°C. After
washing five times in washing buffer, 100 µl of supernatants are applied and incubated overnight
at 4°C. Plates are washed 5 times before incubation with 100 µl HIV-IgG conjugated with biotin
diluted 1:1000 in dilution buffer plus 10% HIV-1 seronegative human plasma for 3 hours at
room temperature. Five 1 min. washings in washing buffer are followed by 30 min.
incubation with 100 µl of 1:1000 avidin-peroxidase (Dako P347 diluted in dilution buffer). Six
1 min. washings, five in washing buffer and the last one in dH2O, are done before colour is
developed with 100 µl of peroxidase enzyme substrate consisting of 4 mg of OPD in 11 ml
water plus 4 µl hydrogen peroxide (30%, w/w). The enzyme reaction is terminated after 30
minutes by addition of 150 µl of 1 M H2SO4.

The HIV antigen concentration in cultures preincubated with mouse serum was expressed
relative to cultures without mouse serum (culture medium), and the percentage inhibitions
of the different dilutions of mouse serum were calculated. The 50% inhibitory concentration
(IC50) for each mouse serum was determined by interpolation from the plots of percent
inhibition versus the dilution of serum, and the neutralizing titer of the serum was expressed
as the reciprocal value of the IC50. In each set-up a human serum pool known to neutralise
other HIV-1 strains was included in the same dilutions as the mouse serum as a calibration
control. For assay of neutralization of the heterologous SHIV89.6P the MT-2 cell-killing
format was used (Crawford et al., 1999). The assay stock of SHIV89.6P was grown in human
PBMC.
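
The titer calculation described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes that inhibition falls with increasing dilution and that the interpolation is linear on a log10 dilution axis, and the function names and the example inhibition values are hypothetical.

```python
import math


def percent_inhibition(antigen_with_serum, antigen_without_serum):
    """Inhibition relative to the culture without mouse serum."""
    return 100.0 * (1.0 - antigen_with_serum / antigen_without_serum)


def neutralizing_titer(reciprocal_dilutions, inhibitions, level=50.0):
    """Reciprocal serum dilution giving `level` percent inhibition.

    Dilutions are given in increasing order. The crossing point is found by
    linear interpolation in log10(dilution) between the two bracketing
    points. Returns None if the curve never crosses `level`.
    """
    pairs = list(zip(reciprocal_dilutions, inhibitions))
    for (d1, i1), (d2, i2) in zip(pairs, pairs[1:]):
        if i1 >= level >= i2 and i1 != i2:
            frac = (i1 - level) / (i1 - i2)
            log_d = math.log10(d1) + frac * (math.log10(d2) - math.log10(d1))
            return 10 ** log_d
    return None


# Four five-fold dilutions starting at 1:20, as used in the assay.
DILUTIONS = [20 * 5 ** k for k in range(4)]

# Hypothetical inhibition values for one serum (illustration only).
example = [90.0, 70.0, 30.0, 10.0]
print(DILUTIONS, neutralizing_titer(DILUTIONS, example))
```

A serum that never reaches 50% inhibition at the lowest dilution yields no titer in this sketch, which corresponds to a titer below 1:20.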

The neutralizing IC50 antibody titers of plasma pools from 10 mice from each group
were measured at different time points (weeks 0, 9, and 18). A positive background in some
preimmune sera, and thus in all week 0 serum pools, was noted even after dilution and heat
inactivation, which was found earlier to lower this background. In general the neutralizing titers
to BX08 virus of such serum pools were transient and low, ranging from 1:6 to 1:150 above
background (data not shown). A possible cross-neutralization reaction to a heterologous,
primary HIV-1 envelope was tested using SHIV89.6P, which is relevant in macaque
models of AIDS, and serum pools from mice DNA immunized i.m. with syn.gp140BX08.
Preimmune serum had a titer of 1:37, which is indicative of a slightly positive background,
whereas the 18 week p.i. serum had a positive neutralizing titer of 1:254 above background.

Example 10: CTL assay

The cellular immune response in mice following gene gun or i.m. genetic immunization with
the different vaccine plasmids was examined (Fig. 26). The spleen was removed aseptically and
gently homogenised to a single cell suspension, washed 3 times in RPMI-1640 supplemented
with 10% FCS and resuspended to a final concentration of 5 x 10^7 cells/ml. The cells were
then incubated for 5 days with mitomycin-C treated (50 µg/ml for 1 hour) mouse P815 (H-2Dd)
stimulator cells at a ratio of 10:1 in medium supplemented with 5 x 10^-5 M
β-mercaptoethanol. For assay of CTL response to HIV-1 BX08, P815 stimulator cells and
target cells were pulsed with 20 µg/ml of the HIV-1 BX08 V3 peptide containing a conserved
murine H-2Dd restricted CTL epitope (IGPGRAFYTT) (Lapham et al., 1996). After
stimulation, splenocytes were washed three times with RPMI-1640 supplemented with 10%
FCS and resuspended to a final concentration of 5 x 10^6 cells/ml. 100 µl of cell suspension
was added in triplicate to U-bottom 96-well microtiter plates and a standard 4 hour
51Cr-release assay performed (Marker et al., 1973).
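
The read-out of a chromium release assay is usually reported as percent specific lysis. The description only refers to a standard assay, so the conventional formula sketched below is an assumption about the calculation; the function name and the counts in the example are hypothetical.

```python
from statistics import mean


def percent_specific_lysis(experimental_cpm, spontaneous_cpm, maximum_cpm):
    """Conventional 51Cr-release calculation from replicate counts.

    specific lysis (%) = 100 * (experimental - spontaneous)
                             / (maximum - spontaneous)
    Each argument is a list of replicate counts (e.g. triplicate wells).
    """
    exp = mean(experimental_cpm)
    spont = mean(spontaneous_cpm)
    maxi = mean(maximum_cpm)
    if maxi <= spont:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (exp - spont) / (maxi - spont)


# Hypothetical triplicate counts per minute (illustration only).
print(percent_specific_lysis([1480, 1520, 1500], [490, 510, 500], [2450, 2550, 2500]))
```

Spontaneous release is measured from target cells incubated without effector cells and maximum release from fully lysed target cells.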

All synthetic BX08 plasmids induced a high specific CTL response, thus confirming the in vivo
expression and in vivo immunogenicity. The highest CTL response was obtained with
syn.gp150BX08, followed by syn.gp120BX08, syn.gp140BX08, and syn.gp160BX08, respectively.
Thus, the CTL response induced did not correlate with the antigen being secreted or not.
However, i.m. DNA immunization with syn.gp150BX08 containing six putative CpG motifs
induced a higher CTL response than gene gun immunization (Fig. 26). This difference could
be explained by the high amount of DNA used in the i.m. injections.

The T-lymphocyte cytokine profile of spleen cells after ConA stimulation as well as the serum
antibody IgG2a/IgG1 ratio at week 18 were investigated. Neither the IFNγ/IL-4 nor the IgG2a/IgG1
ratios, which both reflect a Th1-type of immune response, were significantly higher for the
i.m. immunized mice when compared with gene gun immunized mice (Student's t-test and
Mann-Whitney U-test). Thus, the CTL response did not correlate with a certain Th-type of
response and the DNA immunization technique did not bias the immune response using
synthetic BX08 genes.

Example 11: Antibody responses to DNA vaccination with synBX08 env plasmid 

A relatively low and variable antibody response (1 of 10 mice) was obtained with gene gun
inoculation of the syn.gp140BX08 plasmid vaccine starting at week 9 (figure 23, right panel).
A higher number of responders (3/10) with high IgG1 antibody responses at an earlier onset
(week 3-9) was obtained with the syn.gp140BX08 plasmid using i.m. injection (left panel).
Sera from later time points may show more responders and/or higher titers but were not
assayed. However, these results show the induction of an antibody response to the BX08 V3
peptide by DNA vaccination using one of the described synthetic BX08 constructs.

Claims

1. A method for producing a nucleotide sequence construct comprising the following steps:

a) obtaining a first nucleotide sequence of an HIV gene from a patient within the first 12
months of infection;

b) designing a second nucleotide sequence utilising the most frequent codons from
mammalian highly expressed proteins to encode the same amino acid sequence as the
first nucleotide sequence of a) encodes;

c) redesigning the second nucleotide sequence of b) so that restriction enzyme sites
surround the regions of the nucleotide sequence which encode functional regions of the
amino acid sequence and so that selected restriction enzyme sites are removed, thereby
obtaining a third nucleotide sequence encoding the same amino acid sequence as the
first and the second nucleotide sequence of a) and b) encode;

d) redesigning the third nucleotide sequence of c) so that the terminal snuts contain
convenient restriction enzyme sites for cloning into an expression vehicle;

e) producing the snuts between restriction enzyme sites of c) and terminal snuts of d);

f) assembling the snuts of step e) to form a nucleotide sequence construct.
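
Steps b) and c) can be illustrated with a short sketch: back-translation of an amino acid sequence with one preferred codon per residue, followed by a scan for restriction enzyme recognition sites. The codon table below is an illustrative choice of codons that are frequent in highly expressed human genes and is an assumption of this sketch, not the table used for the BX08 constructs; the function names are hypothetical. The recognition sequences are the standard ones for the named enzymes.

```python
# Illustrative preferred codons (assumed table, one codon per amino acid).
PREFERRED_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}

# Recognition sequences of some enzymes named in this document.
# All are palindromic, so scanning one strand is sufficient.
SITES = {
    "NheI": "GCTAGC", "XhoI": "CTCGAG", "BamHI": "GGATCC",
    "EcoRI": "GAATTC", "PstI": "CTGCAG", "ClaI": "ATCGAT",
}


def back_translate(protein):
    """Encode a protein sequence using the preferred codon of each residue."""
    return "".join(PREFERRED_CODON[aa] for aa in protein.upper())


def find_sites(dna, sites=SITES):
    """Map enzyme name to the 0-based positions of its site in `dna`."""
    hits = {}
    for name, seq in sites.items():
        positions = [i for i in range(len(dna) - len(seq) + 1)
                     if dna.startswith(seq, i)]
        if positions:
            hits[name] = positions
    return hits


# BX08 V3 peptide quoted in Example 8.
v3_dna = back_translate("SIHIGPGRAFYTTGD")
print(v3_dna)
print(find_sites(v3_dna))
```

In the redesign of step c), a site reported by such a scan inside a functional region would be removed by a synonymous codon change, and wanted sites would be introduced at the region boundaries in the same way.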

2. A method according to claim 1, wherein the HIV gene is the gene encoding the envelope.

3. A method according to claim 1 or 2, wherein the HIV gene encodes one or more Gag
proteins.

4. A method according to any of the preceding claims, wherein the HIV in step a) is in group
M, O or N.

5. A method according to claim 4, wherein the HIV is a group M virus.

6. A method according to any of the preceding claims, wherein the HIV is subtype A, B, C, D,
E, F, G, H, I, or J.

7. A method according to claim 6, wherein the HIV is subtype B.

8. A method according to any of the preceding claims, wherein the first nucleotide sequence
is obtained by direct cloning.

9. A method according to any of the preceding claims, wherein the HIV in step a) is isolated
within the first 11 months of infection, such as 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, or 0.5 months after
infection.

10. A method according to any of the preceding claims, wherein the redesigning in step c) is
carried out after the second nucleotide sequence of step b) has been divided into pieces,
so that each piece comprises only different restriction enzyme sites.

11. A method according to claim 10, wherein the second nucleotide sequence of step b) is
divided into 9 pieces, or 8, or 7, or 6, or 5, or 4, or 3, or 2 pieces.

12. A method according to claim 11, wherein the second nucleotide sequence of step b) is
divided into 3 pieces.

13. A method according to any of the preceding claims, wherein the second nucleotide
sequence of step b) is designed utilising the most frequent codons from human highly
expressed proteins to encode the same amino acid sequence as the first nucleotide
sequence of step a) encodes.

14. A nucleotide sequence construct obtainable by the method of any of claims 1-13.

15. A nucleotide sequence construct according to claim 14, wherein the nucleotide sequence
encoding the amino acid sequence in the first variable region is surrounded by EcoRV
and PstI restriction enzyme sites.

16. A nucleotide sequence construct according to claims 14 or 15, wherein the nucleotide
sequence encoding the amino acid sequence in the second variable region is surrounded
by PstI and ClaI restriction enzyme sites.

17. A nucleotide sequence construct according to any of claims 14-16, wherein the
nucleotide sequence encoding the amino acid sequence in the third variable region is
surrounded by ClaI and EcoRI restriction enzyme sites.

18. A nucleotide sequence construct according to any of claims 14-17, wherein the
nucleotide sequence encoding the amino acid sequence in the transmembrane spanning
region is surrounded by HindIII and SacII restriction enzyme sites.

19. A nucleotide sequence construct according to any of claims 14-18, wherein the
nucleotide sequence encoding the amino acid sequence on both sides of the cleavage site
is surrounded by PstI and XbaI restriction enzyme sites.

20. A nucleotide sequence construct in isolated form which has a nucleotide sequence with 
10 the general formula (I), (II), (III), or (IV) or subsequences thereof 

(I) Pl~S495ClarS650-720EcoRrP2~Sl265gp120 

(II) Pl-S495ClarS 6 50-720EcoRrP2-S 12 65Xhor S 14 65Pstr P4gp140 

(III) Pl-S495ClarS 6 50-720EcoRrP2-S 12 65Xhor Sl465Pstr P4gp150 

(IV) Pl-S 4 95ClarS 6 50-720EcoRrP2-Si265Xhor S 146 5Pstr P4gp160- S 2 060Sacir Ps 

wherein P1 designates the nucleotide sequence SEQ ID NO: 41, a nucleotide sequence
complementary thereto, or a nucleotide sequence with a sequence identity of at least 90%
thereto;

wherein S495ClaI designates the nucleotide sequence SEQ ID NO: 7, a nucleotide sequence
complementary thereto, or a nucleotide sequence with a sequence identity of at least
95% thereto;

wherein S650-720EcoRI designates the nucleotide sequence SEQ ID NO: 9, a nucleotide
sequence complementary thereto, or a nucleotide sequence with a sequence identity of at
least 95% thereto;

wherein P2 designates the nucleotide sequence SEQ ID NO: 43, a nucleotide sequence
complementary thereto, or a nucleotide sequence with a sequence identity of at least 85%
thereto;

wherein S1265gp120 designates the nucleotide sequence SEQ ID NO: 19, a nucleotide
sequence complementary thereto, or a nucleotide sequence with a sequence identity of at
least 70% thereto;

wherein S1265XhoI designates the nucleotide sequence SEQ ID NO: 17, a nucleotide sequence
complementary thereto, or a nucleotide sequence with a sequence identity of at least 80%
thereto;

wherein S1465PstI designates the nucleotide sequence SEQ ID NO: 23, a nucleotide sequence
complementary thereto, or a nucleotide sequence with a sequence identity of at least 90%
thereto;

wherein P4gp140 designates the nucleotide sequence SEQ ID NO: 57, a nucleotide sequence
complementary thereto, or a nucleotide sequence with a sequence identity of at least 85%
thereto;

wherein P4gp150 designates the nucleotide sequence SEQ ID NO: 55, a nucleotide sequence
complementary thereto, or a nucleotide sequence with a sequence identity of at least 85%
thereto;

wherein P4gp160 designates the nucleotide sequence SEQ ID NO: 53, a nucleotide sequence
complementary thereto, or a nucleotide sequence with a sequence identity of at least 85%
thereto;

wherein S2060SacII designates the nucleotide sequence SEQ ID NO: 33, a nucleotide
sequence complementary thereto, or a nucleotide sequence with a sequence identity of at
least 98% thereto; and

wherein P5 designates the nucleotide sequence SEQ ID NO: 59, a nucleotide sequence
complementary thereto, or a nucleotide sequence with a sequence identity of at least 85%
thereto.
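
The general formulas (I) to (IV) describe each construct as an ordered series of cassettes. The following Python sketch illustrates only that ordering. It assumes, purely for illustration, that neighbouring cassettes share a six-nucleotide restriction site at each junction and are joined by merging that site. The function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
# Cassette order for each general formula, as listed in claim 20.
COMMON = ["P1", "S495ClaI", "S650-720EcoRI", "P2"]
FORMULAS = {
    "I": COMMON + ["S1265gp120"],
    "II": COMMON + ["S1265XhoI", "S1465PstI", "P4gp140"],
    "III": COMMON + ["S1265XhoI", "S1465PstI", "P4gp150"],
    "IV": COMMON + ["S1265XhoI", "S1465PstI", "P4gp160", "S2060SacII", "P5"],
}


def join_cassettes(fragments, site_len=6):
    """Join fragments left to right, merging the shared junction site.

    Assumption (not stated in the claims): each fragment ends with the same
    `site_len` nucleotides that the next fragment starts with.
    """
    if not fragments:
        return ""
    construct = fragments[0]
    for nxt in fragments[1:]:
        if construct[-site_len:] != nxt[:site_len]:
            raise ValueError("adjacent cassettes do not share a junction site")
        construct += nxt[site_len:]
    return construct


def assemble(formula, cassettes):
    """Build the construct for formula 'I', 'II', 'III' or 'IV'."""
    return join_cassettes([cassettes[name] for name in FORMULAS[formula]])


# Toy cassettes (hypothetical, 15 nucleotides each) sharing 6-nt junctions.
TOY_CASSETTES = {
    "P1": "gctagcaaaatcgat",
    "S495ClaI": "atcgatcccgaattc",
    "S650-720EcoRI": "gaattcgggtctaga",
    "P2": "tctagatttctcgag",
    "S1265gp120": "ctcgagaaaggatcc",
}

print(assemble("I", TOY_CASSETTES))
```

Exchanging a single entry of the cassette dictionary changes only that region of the assembled construct, which is the practical point of a cassette system.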

21. A nucleotide sequence construct according to claim 20, with the formula (I)

(I) P1-S495ClaI-S650-720EcoRI-P2-S1265gp120.

22. A nucleotide sequence construct according to claim 20, with the formula (II)

(II) P1-S495ClaI-S650-720EcoRI-P2-S1265XhoI-S1465PstI-P4gp140.

23. A nucleotide sequence construct according to claim 20, with the formula (III)

(III) P1-S495ClaI-S650-720EcoRI-P2-S1265XhoI-S1465PstI-P4gp150.

24. A nucleotide sequence construct according to claim 20, with the formula (IV)

(IV) P1-S495ClaI-S650-720EcoRI-P2-S1265XhoI-S1465PstI-P4gp160-S2060SacII-P5.

25. A nucleotide sequence construct according to claim 20 consisting essentially of the
subsequence P1.

26. A nucleotide sequence construct according to claim 20 consisting essentially of the
subsequence S495ClaI.

27. A nucleotide sequence construct according to claim 20 consisting essentially of the
subsequence S650-720EcoRI.

28. A nucleotide sequence construct according to claim 20 consisting essentially of the
subsequence P2.

29. A nucleotide sequence construct according to claim 20 consisting essentially of the
subsequence S1265gp120.

30. A nucleotide sequence construct according to claim 20 consisting essentially of the
subsequence S1265XhoI.

31. A nucleotide sequence construct according to claim 20 consisting essentially of the
subsequence S1465PstI.

32. A nucleotide sequence construct according to claim 20 consisting essentially of the
subsequence P4gp140.

33. A nucleotide sequence construct according to claim 20 consisting essentially of the
subsequence P4gp150.

34. A nucleotide sequence construct according to claim 20 consisting essentially of the
subsequence P4gp160.

35. A nucleotide sequence construct according to claim 20 consisting essentially of the
subsequence S2060SacII.

36. A nucleotide sequence construct according to claim 20 consisting essentially of the
subsequence P5.

37. A nucleotide sequence construct with a sequence identity of more than 85% to the
nucleotide sequence construct in any of claims 20-35.

38. A nucleotide sequence construct according to claim 37, wherein the sequence identity is
more than 90%, such as more than 95%, 98%, or 99%.

39. A nucleotide sequence construct according to claim 37, wherein the sequence identity is
100%.
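
The identity thresholds recited above can be illustrated with a short Python sketch. It assumes that the two sequences have already been aligned to equal length, with "-" marking gaps, and it counts identity over all alignment columns. This section does not define how identity is to be computed, so the counting rule, the function names and the example sequences are all assumptions.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same base.

    Both inputs must already be aligned to the same length; '-' marks a gap
    and never counts as a match.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a.lower(), b.lower()) if x == y and x != "-")
    return 100.0 * matches / len(a)


def meets_threshold(a: str, b: str, minimum: float, strict: bool = True) -> bool:
    """strict=True models 'more than X%'; strict=False models 'at least X%'."""
    identity = percent_identity(a, b)
    return identity > minimum if strict else identity >= minimum


# Hypothetical 10-nucleotide example with one mismatch in the last column.
reference = "gctagcgcgg"
candidate = "gctagcgcga"
print(percent_identity(reference, candidate))
```

The `strict` flag separates the wording "more than 85%" from the wording "at least 85%"; the two differ only when the computed identity falls exactly on the threshold.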

40. A nucleotide sequence construct according to any of claims 14-39, coding for an HIV
envelope or parts thereof with an improved immunogenicity obtained by mutating the
nucleotide sequence construct of any of claims 14-39 such that one or more glycosylation
sites in the amino acid sequence have been removed.

41. A nucleotide sequence construct according to claim 40 with a mutation at positions
A307C + C309A and/or A325C + C327G and/or A340C + C342A and/or A385C + C387A
and/or A469C + C471A or any combination of those.
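
Each mutation above is written as old base, 1-based nucleotide position, new base, and each pair changes the first and third base of a single codon (307 = 3 x 102 + 1, so positions 307-309 form codon 103). The Python sketch below applies such labels to a sequence. The helper name and the toy sequence are hypothetical; the example assumes a codon AAC (Asn), which the pair pattern A..C to C..A would turn into CAA (Gln).

```python
import re


def apply_point_mutations(seq: str, mutations: list[str]) -> str:
    """Apply substitutions written as <old base><1-based position><new base>.

    Raises ValueError if a label is malformed or if the base found at the
    given position differs from the old base named in the label.
    """
    bases = list(seq.upper())
    for label in mutations:
        match = re.fullmatch(r"([ACGT])(\d+)([ACGT])", label)
        if match is None:
            raise ValueError(f"cannot parse mutation {label!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(bases):
            raise ValueError(f"position {pos} lies outside the sequence")
        if bases[pos - 1] != old:
            raise ValueError(f"{label}: found {bases[pos - 1]} at position {pos}")
        bases[pos - 1] = new
    return "".join(bases)


# Toy sequence (hypothetical): codon 3, positions 7-9, is AAC (Asn).
toy = "GTGACCAACTGCACC"
mutated = apply_point_mutations(toy, ["A7C", "C9A"])
print(mutated[6:9])  # the codon at positions 7-9 after mutation
```

Checking the old base before substituting guards against applying a label to the wrong numbering scheme, which is an easy mistake when positions refer to a particular construct.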

42. A nucleotide sequence construct according to any of claims 14-41, coding for an HIV
envelope or parts thereof with a binding site for the CXCR4 co-receptor in the third
variable region.

43. A nucleotide sequence construct according to claim 42 with a mutation at positions
G865C + A866G.

44. A nucleotide sequence construct according to any of claims 14-43, coding for an HIV
envelope or parts thereof, wherein an immunodominant epitope has been modified.

45. A nucleotide sequence construct according to claim 44, wherein an immunodominant
epitope in the third variable region has been modified.

46. A nucleotide sequence construct according to claim 45 with a deletion of nucleotides
793-897.
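
The deletion of nucleotides 793-897 spans 105 nucleotides, that is 35 whole codons beginning at the first base of codon 265, so the downstream reading frame is unchanged. The arithmetic can be checked with the following Python sketch; the function names and the toy sequence are hypothetical.

```python
def delete_range(seq: str, start: int, end: int) -> str:
    """Remove nucleotides start..end (1-based, inclusive) from seq."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("deletion range lies outside the sequence")
    return seq[: start - 1] + seq[end:]


def keeps_reading_frame(start: int, end: int) -> bool:
    """True when the deleted stretch is a whole number of codons long."""
    return (end - start + 1) % 3 == 0


def codon_span(start: int, end: int) -> tuple[int, int]:
    """1-based numbers of the first and last codon touched by the deletion."""
    return (start - 1) // 3 + 1, (end - 1) // 3 + 1


print(keeps_reading_frame(793, 897))
print(codon_span(793, 897))
print(delete_range("AAACCCGGGTTT", 4, 9))  # toy sequence, two codons removed
```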

47. A nucleotide sequence construct according to claim 44, wherein an immunodominant 
epitope has been removed from gp41. 

48. A nucleotide sequence construct according to any of claims 14-47, coding for an HIV
envelope or parts thereof, wherein the cleavage site between gp41 and gp120 is
removed.

49. A nucleotide sequence construct according to claim 48 with a mutation at position
C1423A.

50. A nucleotide sequence construct according to claim 20 consisting essentially of the
subsequence P1, S495ClaI, S650-720EcoRI, and P2.

51. A nucleotide sequence construct according to claim 20 consisting essentially of the
subsequence S1265XhoI, S1465PstI, and P4gp140.

52. A nucleotide sequence construct according to claim 20 consisting essentially of the
subsequence S1265XhoI, S1465PstI, P4gp160, S2060SacII, and P5.

53. A nucleotide sequence construct according to any of claims 14-52, further comprising a
nucleotide sequence repeat coding for a functional region of the amino acid sequence.

54. A nucleotide sequence construct according to claim 53, wherein the nucleotide sequence
repeat codes for amino acids in the third variable region.

55. A nucleotide sequence construct according to any of claims 14-54, further comprising a
nucleotide sequence coding for a T-helper cell epitope containing sequence.

56. An expression vehicle selected from a group of viral vectors consisting of Semliki Forest
virus (SFV), adenovirus and Modified Vaccinia Virus Ankara (MVA), further comprising a
nucleotide sequence construct according to any of claims 14-55.

57. A method of individualised immunotherapy wherein the virus from a newly diagnosed
patient is directly cloned, the envelope is produced with highly expressed codons, inserted
into any of the nucleotide sequence constructs of claims 14-55, and administered to the
patient.

58. Use of a nucleotide sequence construct according to any of claims 14-55 in medicine.

59. Use of a nucleotide sequence according to any of claims 14-55 for the manufacture of a
vaccine for the prophylaxis of infection with HIV in humans.

60. Use of a nucleotide sequence according to any of claims 14-55 for the manufacture of a
composition for the treatment of an HIV infection in a human within 24 weeks of primary
infection.

61. Use of the nucleotide sequence according to any of claims 14-55 for the production of a
recombinant protein.

SEQUENCE LISTING

<110> Statens Serum Institut

<120> Method for producing a nucleotide
      sequence construct with optimised codons for an HIV envelope
      based on a primary, clinical HIV isolate and the BX08
      construct.

<130> 21924DKI

<160> 76

<170> FastSEQ for Windows Version 3.0

<210> 1
<211> 243
<212> DNA
<213> Artificial Sequence

<220>
<221> CDS
<222> (1)...(243)

<400> 1
gct agc gcg gcc gac cgc ctg tgg gtg acc gtg tac tac ggc gtg ccc       48
Ala Ser Ala Ala Asp Arg Leu Trp Val Thr Val Tyr Tyr Gly Val Pro
  1               5                  10                  15

gtg tgg aag gac gcc acc acc acc ctg ttc tgc gcc agc gac gcc aag       96
Val Trp Lys Asp Ala Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys
             20                  25                  30

gcc tac gac acc gag gtg cac aac gtg tgg gcc acc cac gcg tgc gtg      144
Ala Tyr Asp Thr Glu Val His Asn Val Trp Ala Thr His Ala Cys Val
         35                  40                  45

ccc acc gac ccc aac ccc cag gag gtg gtg cta ggc aac gtg acc gag      192
Pro Thr Asp Pro Asn Pro Gln Glu Val Val Leu Gly Asn Val Thr Glu
     50                  55                  60

aac ttc aac atg ggc aag aac aac atg gtg gag cag atg cac gag gat      240
Asn Phe Asn Met Gly Lys Asn Asn Met Val Glu Gln Met His Glu Asp
 65                  70                  75                  80

atc                                                                  243
Ile
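
The paired nucleotide and amino acid lines of each CDS entry follow the standard genetic code, which makes it possible to cross-check a damaged line against its translation. The Python sketch below is an illustration only: it builds the standard codon table and translates the first eight codons of SEQ ID NO: 1 as read here. The function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "*",  # stop codons are shown as '*', as in the listing
}


def translate(dna: str) -> list[str]:
    """Translate DNA (spaces allowed) into three-letter amino acid codes.

    Trailing bases that do not fill a whole codon are ignored.
    """
    dna = "".join(dna.lower().split())
    usable = len(dna) - len(dna) % 3
    return [THREE_LETTER[CODON_TABLE[dna[i:i + 3]]] for i in range(0, usable, 3)]


print(" ".join(translate("gct agc gcg gcc gac cgc ctg tgg")))
```

Ignoring an incomplete trailing codon matches entries such as the 506-nucleotide sequence, which holds 168 codons followed by two further bases.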



<210> 2
<211> 81
<212> PRT
<213> Artificial Sequence

<400> 2
Ala Ser Ala Ala Asp Arg Leu Trp Val Thr Val Tyr Tyr Gly Val Pro
  1               5                  10                  15
Val Trp Lys Asp Ala Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys
             20                  25                  30
Ala Tyr Asp Thr Glu Val His Asn Val Trp Ala Thr His Ala Cys Val
         35                  40                  45
Pro Thr Asp Pro Asn Pro Gln Glu Val Val Leu Gly Asn Val Thr Glu
     50                  55                  60
Asn Phe Asn Met Gly Lys Asn Asn Met Val Glu Gln Met His Glu Asp
 65                  70                  75                  80
Ile



143 
3NA 

Artificial Sequence 



1 ■ 



(143) 



gat at 

ASD II' 



age ctg tgg gac ^ag age ctg aag ccc tgc gtg aag ctg 
Ser Leu Trp Asp Gin Ser Leu Lys Pro Cys Vai Lys Leu 
5 10 15 



48 



acc ccc ctg tgc gtg ac: ctg aac tgc acc aag ctg aag aac age acc 
Thr Pr'-' L*-:-'. Cys Val Thr Leu Asn Cys Thr Lys Leu Lys Asn Ser Thr 

20 25 30 



96 



gac acc aac aac acc cg3 tgg ggc acc cag gag atg aag aac tgc ag 
Asp Thr As i. Asn Thr Ar j Trp Gly Thr Gin Giu Met Lys Asn Cys 

40 4 5 



143 



1:11 • 4? 

pp.t 

L 1 :'■ - Artificial Sequence 



Asp I1-: II- Ser Leu Tro Asp Gin Ser Leu Lys Pre Cys Val Lys Leu 

1 5 10 15 

Thr Pro L-u Cys Val Thr Leu Asn Cys Thr Lys Leu Lys Asn Ser Thr 

2C 25 30 

Asp Thr Asn Asn Thr Arg Trp Gly Thr Gin Giu Met Lys Asn Cys 
3 : , 4 0 4 5 



132 
DNA 



113 ■ Artificial Sequence 



{ 132) 



::ag ctt caa cat cag cac cag cgt gcg caa caa gat gaa gcg cga 
Gin Leu Gin His Gin His Gin Arg Ala Gin Gin Asp Giu Ala Arg 
  1               5                  10                  15

gta eg: czz gtt eta cag cct cga caz cgi ccc car cga caa cga caa 

Val Arj Pro Val Leu Gin Pro Gly His Ar.7 Ala His Arg Gin Arc: Gin 

20 25 30 

cac ca j c:j ccg cct gcg cag ctg cad cac ate gat 

His Gin Leu Pro Pro Aia Gin Leu Gin His lie Asp 

3 5 4 0 



■:210 - 6 
<21 1 - 4 4 
:212 > PRT 

'21 i • Artificial Sequence 



Leu 

7 
J. 

Val 
His 



G1:i L-i Gin His Gin His Gin Ar 



Aia Gin Gin Asp Gi; 



Ar i Pro Val Leu Gin Pro Gly His Arg Ala His 

2 0 2 5 ' 

Gl:-. L-.-u Pro Pro Ala Gin Leu Gin 
3 5 4 0 



Gin 



Ala Arg 
15 

Arg Gin 



nis ±j.e Aso 



1".H: • ' 7 

: .• 161 

■ ii i ■ DNA 
213:* Artificial Sequence 



:22 0: 

.1.21:- CDS 



:d ... (161) 



■ 4 0(> 7 

ate ga~ ca*' cac cca ggc ctg ccc 
He Asp Kir, His Pro Gly Leu Pro 



caa 
Gin 



ggt 
Gly 



gag 
Glu 



ctt 

Leu 



cga 
Arg 



gec 
Ala 



cat ccc 
His Pro 
15 



4S 



cat 
His 



cc-. ctt ctg cgc 



c cgc egg ctt 
^eu Arg Pro Arg Arg Leu 
20 25 



nig 



ca r. 
His 



Pro 



gaa 
Glu 



gtg 
Val 

3 0 



:aa caa 



caa gau ct * caa egg cac egg ccc ctg 
Gin Asp: Leu Gin Arg His Arg Pro Leu 
35 4 0 



His 



caa 
Gin 



cgt 
Ara 



gag 
Glu 

4 S 



His 



cgt gca 
Arg Ala 



144 



gtg cac cca egg aat tc 
Val His Pro Arg Asn 



PRT 

Artificial Sequence 



■ 400> S 

lie Asp Hie His Pro Glv Leu Pro Gin Giv Glu Le 



u arq rt^a His rr; 
His Pro Leu Leu Arg Pro Arg Arg Leu Arg His Pro Glu Val Gln Gln
             20                  25                  30
Gln Asp Leu Gln Arg His Arg Pro Leu His Gln Arg Glu His Arg Ala
         35                  40                  45
Val His Pro Arg Asn



<L 1 - ' 



25 4 
ON A 

.Artificial Sequence 



:220 • 

- . - <i i " 

- .:. Z. I * 



- t 0 0 * 
?tV5 Ai t 



Trp 

5 



tga gca ccc age 
+ Ala Pro Ser 



qc : tga 
ys Cys T 
10 



Thr Ala 



tgg 
Trp 



■icq 

Pro 
i = 



ag ; ag ; a ;jg tag tea tea gat ctg aga act tea cca aca 
Ar j Ar ; Arg Trp * Ser Asp Leu Arg Thr Ser Pro Thr 

20 25 



Thr 



cca 
Pro 



aga cci tea teg tgc age tga acg aga gcg tgg aga tea act gca ccc 
Arg Pro Ser Ser Cys Ser * Thr Arg Ala Trp Arg Ser Thr Ala Pro 

30 35 40 



144 



gee zc-i ac,-. aca aca ccc gca aga gca tec aca teg gec ctg gee gcg 
Ala Pro Thr Tnr Thr Pro Ala Arg Ala Ser Thr Ser Ala Leu Ala Ala 
45 50 55 60 



192 



cct tct aca cca ccg gcg aca tea teg gcg aca tec gee agg ccc act 
Pro Ser Thr Fro Pro Ala Thr Ser Ser Ala Thr Ser Ala Arg Pro Thr 

65 7] 75 



240 



qca 
Ala 



Thr Se: 



eta 
-0 



ga 



•.in e>: 

• 212 • FP.T 

■ ' 21 1 3 : ■ Artificial Seauence 



■:4 00> 10 

Slu Pne Ala Pro Trp Ala Fro Ser Cys Cys Thr Ala Ala Trp Pro Arg 
5 10 15 

: Asp Leu Arg Thr Ser Pro Thr Thr Pro Arg Pro Ser 

25 30 
: Arg Ala Trp Arg Ser Thr Ala Fro Ala Pro Thr Thr 
4 0 4 5 

Ala Arg Ala Ser Thr Ser Ala Leu Ala Ala Fro Ser Thr Pro 
oO 55 60 

Fro Ala Thr Ser Ser Ala Thr Ser Ala Arg Pro Tnr A La Thr Ser Leu 
£5 7 0 "5 8 0 



Arg Arg 



Tnr Pro 



S o - 
<211> 92
<212> DNA
<213> Artificial Sequence

<400> 11

tct ac j ar : aic tgg acc aac acc ctg aag cgc gtg gC c gag aag ctg 

Ser A— Ihr Asn Trp Thr Asn Thr Leu Lys Arg Vai Ala Glu Lys Leu 

5 10 15 

cgc ga j aa: tic aac aac acc acc ate gtg ttc aac cag age t: 

Arg G: j Lv.> Pne Asn As- Thr Thr He Val Fhe Asn Gin Ser 

•=C 25 30 



48 



<210> 12
<211> 30
<212> PRT
<213> Artificial Sequence

<400> 12
Ser Arg Thr Asn Trp Thr Asn Thr Leu Lys Arg Val Ala Glu Lys Leu
  1               5                  10                  15
Arg Glu Lys Phe Asn Asn Thr Thr Ile Val Phe Asn Gln Ser
             20                  25                  30

..io.- i:< 

- Jll • 130 

■ 12 ■ DMA 

-13 • Artificial Sequence 



• 2J.i ■ C ) ... (133) 

•mo - i:- 

gag ct - ;g j cag cga c;c cga gat cgt gat gca cag ctt caa ctg egg 
G ^ u Le '- ■■"*-■-) Ar 9 Arg Pro Arg Asp Arg Asp Ala Gin Leu Gin Lea Arg 

1 5 



1 n 10 



egg cgj gtt ct - : cta c:g caa cac cac cca gct gtt caa c ^ , „ ( 

Arg Arg Va L Leu Leu Leu Gin His His Pro Ala Val Gin Gin His Leu 

25 30 

gaa cg.i ga ; caa cag cga ggg caa cat cac tag t 13. 
Glu Ar: Asp Gin Gin Arg Giy Gin His His * 
3 z - 4 0 



1 0 ■ 14 

• az 

21.. • PRT 

Artificial Sequence 



:4 00 > 
Glu Leu Arg Arg Arg Pro Arg Asp Arg Asp Ala Gln Leu Gln Leu Arg
  1               5                  10                  15
Arg Arg Val Leu Leu Leu Gln His His Pro Ala Val Gln Gln His Leu
             20                  25                  30
Glu Arg Asp Gln Gln Arg Gly Gln His His
         35                  40

<21Q> 15 

■ : : • 154 

- 2 1 2 » DNA 

'213* Artificial Sequence 



^221 ■ COS 

: 2 2 2 - ;:;...( 164 ) 

:4C^ ■ 1 ;■ 

act agt go: a :c ate acc ctg ccc tgc cgc ate aag cag ate ate aac 46 

Thr Ser Gly T-ir lie Thr Leu Pro Cys Arg He Lvs Gin He He Asn 
2 5 10 15 

atg tg.j cag gag qtg ggc aag gec atg tac gec ccc ccc ate gg;: ggc 96 
Met Trp Gin Glu Val Gly Lys Ala Met Tyr Ala Pro Pro He Gly Gly 
20 25 30 

cag at-: aau tgc ctg age aac ate acc ggc ctg ctg ctg acc cgc gac 144 
Gin II--; Lyr-; Cys Leu Ser Asn He Thr Gly Leu Leu Leu Thr Arg Asp 
3 : 4 0 4 5 

ggc gg : age gac aac teg ag 164 
Gly Gly Se: Arp Asn Ser 
5-'i 



<210> 16
<211> 54
<212> PRT
<213> Artificial Sequence

<400> 16
Thr Ser Gly Thr Ile Thr Leu Pro Cys Arg Ile Lys Gln Ile Ile Asn
  1               5                  10                  15
Met Trp Gln Glu Val Gly Lys Ala Met Tyr Ala Pro Pro Ile Gly Gly
             20                  25                  30
Gln Ile Lys Cys Leu Ser Asn Ile Thr Gly Leu Leu Leu Thr Arg Asp
         35                  40                  45
Gly Gly Ser Asp Asn Ser
     50



2 00 
DMA 
Arti 



ficiai Sequence 
ctc gag cag cgg caa gga gat ttt ccg ccc cgg cgg cgg cga cat gcg       48

Leu Glu Gin Arg Gin Giy Asp Phe Pro Pro Ara Arg Arg Arg His Ala 

1 5 :c 15 

cga caa c.g g:g cag :ga get gra caa gta caa ggt ggt gaa gat cga 96 

Arg Gin Leu Ala Gin Arg Ala Val Gin Vai Gin Giy Gly Giu Asp Arg 

20 25 ^ 30 

gee cct g :g cat cgc ecc cac caa ggc caa gcg ccg cat ggt gca gcg 144 

Ala Pr ) Giy His Arg Pro His Gin Giy Gin Ala Pro Arq Giy Ala Ala 

35 40 45 

rzrra ga'i g :g e^e egt ggg cat egg cgc- tat gtt cct egg ctt cct ggg 192 

Arg Gl-j A. a Arg Arg Civ His Arg Arg Tvr Val Pro Arg Leu Pro Giy 

^ 55 60 

cgc tg ; a ; 20C 
Arg Oy.; 

6 5 



<210> 18
<211> 66
<212> PRT
<213> Artificial Sequence

<400> 18
Leu Glu Gln Arg Gln Gly Asp Phe Pro Pro Arg Arg Arg Arg His Ala
  1               5                  10                  15
Arg Gln Leu Ala Gln Arg Ala Val Gln Val Gln Gly Gly Glu Asp Arg
             20                  25                  30
Ala Pro Gly His Arg Pro His Gln Gly Gln Ala Pro Arg Gly Ala Ala
         35                  40                  45
Arg Glu Ala Arg Arg Gly His Arg Arg Tyr Val Pro Arg Leu Pro Gly
     50                  55                  60
Arg Cys
 65



i:.- 2:2 

:• DIIA 

:15- Artificial Sequence 



:?.c: 



CDS 

(1 ) , 



112) 



etc 
Leu 



-:4 30> 
gaa cag 
Glu Gin 



19 

egg caa gga gat ttt ccg ecc egg egg egg cga cat gcg 

Arg Gin Giy Asp Phe Pro Pro Arg Arg Arg Arg His Ala 

5 10 15 



cga 
Arg 



ca, : ctg 

Gin Leu 



gc 
A ^ 



cag cga get gta :aa gta caa ggt ggt gaa gat cga 
a Gin Arg Ala Val Gin Val Gin Giy Gly Giu Asp Arc 

25 30 



a 



cc* ggg cat cgc ccc cac caa }gc caa gcg ccg egt ggt gca gcg 
Pro Giy His Arg Pro His Gin Gly Gin Ala Pro Arg Gly Ala Ala 
2: 40 45 
cga gaa gcg cgc
Arg Glu Ala Arg 



;ta egg cat egg tgc tat gtt cct egg ctt cct ggg 
,eu 31/ His Arg Arg Tyr Val Pro Arg Leu ?r: -31 y 

55 ' 60 



192 



cgc tg: age ccg ggg gat cc 
Arg Cy; 3er Fro 31 y Asp 
6 5 * 7 0 



21; 



L0'- 20 
Li ■ -'0 



PRT 
Art i 



::al Secruence 



Leu 
i_ 

Arg 

Ala 

Arg 

Arg 
65 



■:.;iv • ::o 

Gl ; Gl: ; Arg 31 n GLy Asp Phe 
5 

Gl'i L-u Ala 3 It Ar : Ala Val 

■ . u 

Pr > Gly [lis Arg ?d His Gin 

3: 4 0 

Gl.i Alo Arg Leu GLy His Arg 
50 " 55 

Cy.-i l-er Pro GLy Asp 
7 0 



Pro 

10 

Val 



25 
Giv 



Arg Tyr 



Arg Arg Arg Arg His Ala 
15 

3 in Gly Giy Glu Asp Arg 
3C 

Ala Pro Arg Giy Ala Ala 

A 5 

Val Pro Arg Leu Pro Gly 

60 



-.210 

:2 1 1 



21 
2C0 
ON A 

Artificial Sequence 



:22C 

^ ■ : i 



COS 

■ 1) . . . {200 ' 



■ : 4 0 L ■ : 11 

etc gag ea:i egg :3a gga gat ttt 

Leu Gl i Giii Arg Gin Gly Asp Phe 

1 5 



:cg ccc egg egg egg cga cat gcg 
Pro Pro Arg Arg Arg Arg His Ala 
10 15 



48 



cga ca.i gcg cag cga get gta 

Arg Gin Leu Ala Gin Arg Ala Val 



:aa gta caa ggt ggt gaa gat cga 
Gin Val Gin Gly Gly Glu Asp Arg 

2 5 30 



96 



cct gg<j eat eg: 
Pro Glv His Arc 



cac caa 
His Gin 
■3 0 



ggc 
Glv 



caa geg eeg egt ggt gca gcg 
Gin Ala Pre- Arg Gly Ala Ala 
4 5 



114 



c g a 
Arg 



gai gat; cgc cgt gg : cat egg cc 



Glu G:u Arg Arg Giy His Arg 
5 " 5 5 



Arq 



gtt cct ccg ctt cct ggg 
Val Pro Arg Leu ?r: Gly 
60 



:g : <• : 



/*.rq 
o5 
<211> 66
<212> PRT
<213> Artificial Sequence

<400> 22
Leu Glu Gln Arg Gln Gly Asp Phe Pro Pro Arg Arg Arg Arg His Ala
  1               5                  10                  15
Arg Gln Leu Ala Gln Arg Ala Val Gln Val Gln Gly Gly Glu Asp Arg
             20                  25                  30
Ala Pro Gly His Arg Pro His Gln Gly Gln Ala Pro Arg Gly Ala Ala
         35                  40                  45
Arg Glu Glu Arg Arg Gly His Arg Arg Tyr Val Pro Arg Leu Pro Gly
     50                  55                  60
Arg Cys



17 8 
3NA 

Artificial Seauence 



22: ■ ::s 
■ 12. I ! . . . ! 1 7 8 ) 

• 4 00. ■ 2 3 

ctg ca; gca jc a cca tgg gcg ccg cc a gcc tga ccc tga ccg tgc agg 48 

Leu Gin A.. ,i Aia Pro Trp Ala Pro Pro Ala * Pro * Pro Cys Arg 

15 10 

ccc gc ; age *.;g; tga gcg gca teg tgc age age aga aca acc t gc tgc 96 

Pro Ale Sc-r Cys * Ala Ala Ser Cys Ser Ser Arg Thr Thr Cys Cys 

15 20 2 5 

gcg :cj tr:; -gg ccc age age ace tgc tec age tga ^cg tgt ggg gca 14 4 

Aia ?r- S(-r Arg Pre Ser Ser Thr Cys Ser Ser * Pro Cys Gly Ala 

3 0 3 5 4 0 

tea age ar.c 'c: agg ccc gcg tgc tgg etc tag a 176 

Ser Ser Sr-r Oer Arg Pro Aia Cys Trp Leu 

4 5 5 0 



213:- 24 
211:- 54 
2K> PPT 

213:- Artificial Sequence 



- : 4 CM i : - 34 

Leu Gin Ala Ala Pre Trp Ala Pro Pr: A La Pro Pro Cys Arg Fro Ala 

1 5 10 " 15 

Ser Cys Alii Ala Ser Cys Ser Ser Arg Thr Thr Cys Cys Ala Pro Ser 

^0 ' 2 5 3 0 

Arg Pro Sav Ser Thr Cys Ser Ser Pro Cvs Glv Aia Ser Ser Ser Ser 

35 '40 ' 45 

Arg Pro Aid Cvs Trp Leu 
50 
<211> 178
<212> DNA
<213> Artificial Sequence

<221> CDS
<222> (1)...(178)

<400> 25

ctg ca; :;c3 gca cca tgg gcg ccg cca gcc tga ccc tga ccg tgc agg 4 

Leu Glr. All Ala Pro Trp Ala Pro Pro Ala * Pro * Pro Zys Ara 

1 5 io 

ccc gc: acr: tgc tga gcg gca teg tgc age age aga aca acc tgc tgc 9 

Pro AI 1 ;er Zys - Ala Ala Ser Cys Ser Ser Arg Thr Thr Cys Cys 

15 20 25 

gcg cc 1 t:-? agg ccc age age acc tgc tec age tea ccg tgt ggg gca 14 

Ala ?r- Arg Pre S*r Ser Thr Cys Ser Ser * Pro Cys Gly Ala 

30 3.5 4C 

tea ag: a :t g-t gcg gcc gcg tgc tgg etc tag a 17 

Ser Ser S-rr Ala Ala Ala Ala Cys Trp Leu - 

45 50 



• 2i: • 26 

• 2 1 1 54 

•212' PRT 

•213- Artificial Sequence 
-400^ 26 

Leu Glr: An Ala Pre Trp Ala Pro Pro Ala Pro Pro Cys Arg P^o Ala 

1 5 10 15 

Ser Cys Ai3 Ala Ser Cys Ser Ser Arg Thr Thr Cys Cys Ala Pro Ser 

20 25 30 

Arg Pr;. 3-r Ser Thr Cys Ser Ser Pro Cys Gly Ala Ser Ser Ser Ala 

35 40 4 5 

Ala All Ala Cys Trp Leu 
50 



-211^ 77 

212> DNA 
--21.V- Artificial Sequence 

• 220- 

• 221 * CDS 

'-222 - ( 1) ... (77) 

<:400 ' 27 

tet aga gcg eta cct cca gga cea gcg ctt cct ggg cat gtg ggg ctg 4- 

Ser Arg Ala Leu Pro Pre Gly Pro Ala Leu Pro Gly His Val Gly Leu 

^ 10 15 

etc egg caa get gat ctg cac cac ggc eg 
Leu Arg Gin Ala Asp Leu His His Glv 

20 25 
<210> 28
<211> 25
<212> PRT
<213> Artificial Sequence

<400> 28
Ser Arg Ala Leu Pro Pro Gly Pro Ala Leu Pro Gly His Val Gly Leu
  1               5                  10                  15
Leu Arg Gln Ala Asp Leu His His Gly
             20                  25



; ■«(> 
hja 

.-rtificial Sequence 



190; 



egg cci tg>: '-ct gga acg cca get gga gca aca aga acc tga gec aga 48 
Arg Pri Cv: ::c Gly Thr Pro Ala Gly Ala Thr Arg Thr * Ala Arg 
1 ~ 5 10 15 



ttt gg-^ aci .ica tga cct gga tgg agt ggg age gcg aga tea gca act 96 
Phe Glv Thr Thr * Pro Gly Trp Ser Gly Ser Ala Arg Ser Ala Thr 

20 25 30 



aca cc-:i ag 3 r . .:a tct aca gec tga teg agg aga gee aga acc age agg 14 4 

Thr Pr-:- Ar .j . : .->r Ser Thr Ala * Ser Arg Arg Ala Arg Thr Ser Arg 
3 5 4 0 4 5 



aga aga a 24 i-.:c tgg acc tgc tec age tgg aca agt ggg caa get t 190 
Arg Ar : Thr ,-r Trp Thr Cys Ser Ser Trp Thr Ser Gly Gin Ala 

50 5 5 60 



Artificial Sequence 
- 4 GO • :-.o 

Arc; Pr:- Zys- fro Gly Thr Pro Ala Gly Ala Thr Arg Thr Ala Arg Phe 

r 5 10 15 

Gly Thr Thr Pro Glv Trp Ser Gly Ser Ala Arg Ser Ala Thr Thr Pro 

' 25 30 

Arg Ser Ser Tnr Ala Ser Arg Arg Ala Arg Thr Ser Arg Arg Arg Thr 

35 40 45 

Ser Trp Thr ; ;s Ser Ser Trp Thr Ser Gly Gin Ala 
5C 5 5 60 

■:210 - J 1 
■:211 ■ 1' 7 7 
■:212 • CI] A 

.21 3.* Artificial Sequence 
<220>
<221> CDS
<222> (1)...(177)

<400> 31

aag ctt jr. ; gaa ctg gtt caa cat cac caa ctg get gtg g^a cat 

Lys Lei Val Glu Leu Val Gin His His Gin Leu Ala 7a 1 7ai His 

i 5 10 15 



caa 
Gin 



gat ttt -rat cat gat cgt ggg egg c:t gat. egg eet gcg cat cgt gtt 
Asp Pho His His Asp Arg 31y Arg Pro Asp Arg Pro Ala His Arg 7ai 

20 25 30 



cac cgt -j-:r gag cat cqt gaa ccg cgt gcg cca ggg eta cag ccc 
His Arg Ai.i GLu His Arg Giu Pro Arg Ala Pro Gly Leu Gin Pro 
:i r : 40 ' 45 



1-1 



gag etc c .: i gic ccg eet gee cgt gee cog egg 
Glu Lea Pn A.id Pro Pro Ala Arg Ala Pro Arc 
50 55 



2 10 ■ .32 

211.- 5} 

2 12 • PRT 

21,;-* Artificial Sequence 



■ 4 00 * 32 

Lys Leu Va 1 GLu Leu Val Gin His His Glr. Leu Ala Val Val His Glr. 

15 10 15 

Asp Pho Has His Asp Arg Gly Arg Pro Asp Arg Pro Ala His Arg Val 

2n 25 30 

His Ar; Ala Giu His Arg Glu Pro Arg Ala Fro Gly Leu Gin Pro Pro 

35 4 0 4 5 

Glu Leu Pro A;^d Pro Pro Ala Arg Ala Pro Arg 
50 55 

■ 210 - 33 

211 • 14 0 
2] 2 • DIIA 

212 • Artificial Sequence 

*-220> 

-■211> CDS 

■:22l % > ( 1 ) . . . (140) 

- 4 00;- 3."* 

ccg egg ccc cga eeg ccc ega ggg cat cga gga gga ggg egg cga gcg 
Pro Ar: Pro Arg Pro Pro Arg Gly His Arc Gly Gly Gly Arg Arg Ala 



cga cci cga ccg cag cac ccg cct ggt gac egg ctt eet gee cc: gat 
Arg Pr j Arg Pro Gin His Pro Pro Gly Asp Arg Leu Pro Ala Pro Asp 

20 2 5 30 



ctg ggi cca cct gcg cag ccz gtt cct gtt cag eta cca teg at 
Leu Gly Arg Pro Ala Gin Pro Val Pro Val Gin Leu Pre Ser 
5 4 0 4 5 
<210> 34
<211> 46
<212> PRT
<213> Artificial Sequence

<400> 34

Pro Ar j ?r :> Arg Pro Pro Arg Gly His Ara Gly Gly Giv 

5 10 ' 

Arq Fr j Ar : Pro Gin His Pro Pro Gly Asp Arg Leu Pro 

20 25 
Leu Giv Ar : Pro Ala Gin Pro Val Pro 7a I Gin Leu Pro 
^ 4 0 4 5 



Arg Arg Ala 
15 

Ala Pro Asp 

30 



2 11 ■ 129 
2 12- DtIA 

2.3 * Artificial Sequence 



2 20 



::)... (129) 



ate 
lie 
I 

Trp 



■M'.'O • 3^ 

gat tg~ gcg acc tgc tgc tga teg tgg ccc gca 

Asp Cys Ala Thr Cys Cys * Ser Tro Pro Ala 

5 10 

gc i 7gj gcg get ggg aga tec tga agt act ggt 

Ala Gly Ala Ala Gly Arg Ser * Ser Thr Gly 

20 25 



teg 
Ser 



gga 

Gly 



tgg age tgc 
Trp Ser Cys 
15 

acc tgc tec 
Thr Cys Ser 
30 



48 



96 



agt act jg.i gee agg age tga aga act ctg cag 
Ser Thr Gly Ala Arg Ser + Arg Thr Leu Gin 
35 4 0 



129 



2 1. 0 • 3 fc 
211 • 40 
212- PPT 

2.3- Artificial Sequence 



• 400.- 36 

He Asp Cys Ala Thr Cys Cys Ser Trp Pro Ala Ser Tro Ser Cys Trp 

15 10 15 

Ala Gly Ai.-i Ala Gly Arg Ser Ser Thr Gly Gly Thr Cys Ser Ser Thr 

20 25 30 

Gly Ala Arg Ser Arg Thr Leu Gin 
; 5 4 0 



• : 2 : 1 



37 

114 

DKA 



• ' : • Artificial Sequence 



CDS 

U: ... (114) 
<400> 37



ctg ca:j tga ccc tgc tga acg cca ccg cca teg ccg tgg ccg agg gca 48 
Leu Gin - Ala Cys " Thr Pro Pro Pro Ser Pro Trp Pro Arg Ala 



1 



10 



ccg acc qc. j t ga teg agg tgg tgc age gca tct ggc gcg gca tee tgc 96 

Pro Thr Ala * Ser Arg Trp Cys Ser Ala Ser Gly Ala Ala Ser Cys 

15 20 25 

aca tc. : cca ccc gaa ttc ]_ i 4 

Thr S<-r Pr > Pro Glu Phe 

30 35 



1 C > 3 6 
211 ■ 35 
■ 212 ■ F RT 
2 1 j ■ Artificial Sequence 

• -iOC • 3 3 

Leu Glr. Al * Cys Thr Pro Pro Pro Ser Pro Trp Pro Arg Ala Pro Tnr 

15 :c 15 

Ala Ser Arg Trp Cys Ser Ala Ser Gly Ala Ala Ser Cys Thr Ser Pro 
20 25 30 

Pro G1j Phe 





35 






- 210 • 


30 




- 211 - 


41 




• 2 1 2 ■ 


DNA 




■ 213 • 


Artificial Sequence 




220 - 






■ ^2 1 - 


CDS 




.222 * 


1 1 } . . . ( 4 1 ) 




4 00 ■ 


3 3 


gaa 


tt: go - : 


agg get teg age gcg 


Glu 


Ph- Ala 


Arg Ala Ser Ser Ala 


1 







:gc tgt aag gat cc 4 1 

"ys Cys Lys Asp 
10 



■ 210 - 40 
211- 13 

• 212 • PRT 

•'213 - Artificial Sequence 

■ 4 00 • 4 0 

Glu Phe Ala Arg Ala Ser Ser Ala Pro Cys Cys Lys Asp 

1 5 1C 

<210> 41
<211> 506
<212> DNA
<213> Artificial Sequence

<220>
<221> CDS
<222> (1)...(506)

<400> 41

get ag: go.j gec gac cgc ctg t gg gtg acc gtg :ac tac ggc gtg ccc 43 

Ala Ser All Ala Asp Arg Leu Trp Val Thr Val Tyr Tyr Giy Val Pre 

1 5 10 15 

gtg tg j ai j gar gec acc acc arc ctg etc tg: gec age gac gec aag 96 

Val Trp L/.i Asp Ala Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys 

2D 2 5 30 

g:: ta : ga : acc gag gtg cac aac gtg tgg gec acc cac gcg tgc gtg 144 

Ala Tyr Acp Thr Olu Val His Aon ^'al Trp Ala Thr His Ala Cys Val 

V) 4 D 4 5 

c:: ac : gac ccz aac ccc cag gag gtg gtg ctg ggc aac gtg acc gag 192 

Pro Thr Acp Pro Asn Pro Gin Glu Val Val Lea Gly Asn Val Thr Glu 

5 1 5 5 6 ] 

aac tt: a<;e atg ggc aag aac aac atg gtg gag cag atg cac gag gat 240 

Asn Phe A.sri Met Gly Lys Asn Asn Met Val Glu Gin Met His Glu Asp 

65 7 0 7 5 8 0 

ate at:: auo :tg tgg gac cag age ctg aag ccc tgc gtg aag ctg acc 288 

lie lie Ser Leu Trp Asp Gin Ser Leu Lys Pro Cys Val Lys Leu Thr 

8 5 90 95 

ccc ct i tec gtg acc ctg aac tgc acc aag ctg aag aac age acc gac 336 

Pro Leu Cys Val Thr Leu Asn Cys Thr Lys Leu Lys Asn Ser Thr Asp 

100 105 110 

acc aac aac ace eg z tgg ggc acc cag gag atg aag aac tgc age ttc 384 

Thr Asn Asr. Thr Arg Trp Gly Thr Gin Glu Met Lys Asn Cys Ser Phe 

115 120 125 

aac at.: acc ace ac: gtg :gc aac aag atg aag cgc gag tac gee ctg 4 32 

Asn II-j S<-- Thr Ser Val Arg Asn Lys Met Lys Arg Glu Tyr Ala Leu 

130 135 140 

ttc ta: aac ctg gac ate gtg ccc ate gac aac gac aac acc age tac 4 80 

Phe Tyr Ser Leu Asp lie Vai Pro lie Asp Asn Asp Asn Thr Ser Tyr 

145 " 150 15c 16j 

cgc ctg cgc age tgc aac aca teg at 506 

Arg Leu Arg Ser Cys Asn Thr Ser 



1 



210:- 

2 1 : > 



42 

163 

PRT 



21.V- Artificial Sequence 
:4 0m> 4 2 

Ala Ser Ala Ala Asp Arg Leu Trp Val Thr Val Tyr Tyr Gly Val Pro
  1               5                  10                  15
Val Trp Lys Asp Ala Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys
             20                  25                  30
Ala Tyr Asp Thr Glu Val His Asn Val Trp Ala Thr His Ala Cys Val
         35                  40                  45
Pro Thr Asp Pro Asn Pro Gln Glu Val Val Leu Gly Asn Val Thr Glu
     50                  55                  60
Asn Phe Asn Met Gly Lys Asn Asn Met Val Glu Gln Met His Glu Asp
 65                  70                  75                  80
Ile Ile Ser Leu Trp Asp Gln Ser Leu Lys Pro Cys Val Lys Leu Thr
                 85                  90                  95
Pro Leu Cys Val Thr Leu Asn Cys Thr Lys Leu Lys Asn Ser Thr Asp
            100                 105                 110
Thr Asn Asn Thr Arg Trp Gly Thr Gln Glu Met Lys Asn Cys Ser Phe
        115                 120                 125
Asn Ile Ser Thr Ser Val Arg Asn Lys Met Lys Arg Glu Tyr Ala Leu
    130                 135                 140
Phe Tyr Ser Leu Asp Ile Val Pro Ile Asp Asn Asp Asn Thr Ser Tyr
145                 150                 155                 160
Arg Leu Arg Ser Cys Asn Thr Ser
                165

<210> 43
<211> 374
<212> DNA
<213> Artificial Sequence

<220>
<221> CDS
<222> (1)...(374)

<400> 43

tct aga ace aac tgg acc aac acc ctg aag cgc gtg gec gag aag ctg 48 

Ser Arg Thr Asn Trp Thr Asn Thr Leu Lys Arg Val Ala GIu Lys Leu 
1 5 10 15 

cgc gag aag ttc aac aac acc acc ate gtg ttc aac cag age tec ggc 96 
Arg GIu Lys Phe Asn Asn Thr Thr He Val Phe Asn Gin Ser Ser Gly 
20 25 30 

ggc gac :cc gag ate gtg atg cac age tt: aac tgc ggc ggc gag ttc 144 
Gly Asp Pro GIu He Val Met His Ser Phe Asn Cys Gly Gly GIu Phe 
35 40 4 5 

ttc tac tgc aac acc acc cag ctg ttc aa: age acc tgg aac gag acc 192 
Phe Tyr Zys Asn Thr Thr Gin Leu Phe Asn Ser Thr Trp Asn GIu Thr 
50 55 60 

aa- age gag ggc aac ate act agt ggc acc ate acc ctg ccc tgc cgc 24 0 

Asn Ser GIu Gly Asn He Thr Ser Gly Thr lie Thr Leu Pre Cys Arg 

65 7 0 7 5 8 0 

ate aag cag ate ate aac atg tgg cag gag gtg ggc aag gec atg tac 283 
He Lys Gin lie lie Asn Met Trp Gin GIu Val Gly Lys Ala Met Tyr 
85 90 ' 95 

gec ccc ccc ate ggo ggc cag ate aag tgc ctg age aac ate acc gec 3 3 -i 

Ala Pro Pro lie Gly Gly Gin He Lys Cys Leu Ser Asn lie Thr Giv 
130 105 no 

ctg ctg ctg acc cgc gac ggc ggc age gac aac teg ag 
Leu Leu Leu Thr Arg Asp Gly Gly Ser Asp Asn Ser 
11: 120 
<210> 44
<211> 124
<212> PRT
<213> Artificial Sequence

<400> 44
Ser Arg Thr Asn Trp Thr Asn Thr Leu Lys Arg Val Ala Glu Lys Leu
  1               5                  10                  15
Arg Glu Lys Phe Asn Asn Thr Thr Ile Val Phe Asn Gln Ser Ser Gly
             20                  25                  30
Gly Asp Pro Glu Ile Val Met His Ser Phe Asn Cys Gly Gly Glu Phe
         35                  40                  45
Phe Tyr Cys Asn Thr Thr Gln Leu Phe Asn Ser Thr Trp Asn Glu Thr
     50                  55                  60
Asn Ser Glu Gly Asn Ile Thr Ser Gly Thr Ile Thr Leu Pro Cys Arg
 65                  70                  75                  80
Ile Lys Gln Ile Ile Asn Met Trp Gln Glu Val Gly Lys Ala Met Tyr
                 85                  90                  95
Ala Pro Pro Ile Gly Gly Gln Ile Lys Cys Leu Ser Asn Ile Thr Gly
            100                 105                 110
Leu Leu Leu Thr Arg Asp Gly Gly Ser Asp Asn Ser
        115                 120

<210> 45
<211> 1277
<212> DNA
<213> Artificial Sequence

<220>
<221> CDS
<222> (1)...(1277)

<400> 45

gct agc gcg gcc gac cgc ctg tgg gtg acc gtg tac tac ggc gtg ccc 48
Ala Ser Ala Ala Asp Arg Leu Trp Val Thr Val Tyr Tyr Gly Val Pro
1 5 10 15

gtg tgg aag gac gcc acc acc acc ctg ttc tgc gcc agc gac gcc aag 96
Val Trp Lys Asp Ala Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys
20 25 30

gcc tac gac acc gag gtg cac aac gtg tgg gcc acc cac gcg tgc gtg 144
Ala Tyr Asp Thr Glu Val His Asn Val Trp Ala Thr His Ala Cys Val
35 40 45

ccc acc gac ccc aac ccc cag gag gtg gtg ctg ggc aac gtg acc gag 192
Pro Thr Asp Pro Asn Pro Gln Glu Val Val Leu Gly Asn Val Thr Glu
50 55 60

aac ttc aac atg ggc aag aac aac atg gtg gag cag atg cac gag gat 240
Asn Phe Asn Met Gly Lys Asn Asn Met Val Glu Gln Met His Glu Asp
65 70 75 80

atc atc agc ctg tgg gac cag agc ctg aag ccc tgc gtg aag ctg acc 288
Ile Ile Ser Leu Trp Asp Gln Ser Leu Lys Pro Cys Val Lys Leu Thr
85 90 95

ccc ctg tgc gtg acc ctg aac tgc acc aag ctg aag aac agc acc gac 336
Pro Leu Cys Val Thr Leu Asn Cys Thr Lys Leu Lys Asn Ser Thr Asp
100 105 110

acc aac aac acc cgc tgg ggc acc cag gag atg aag aac tgc agc ttc 384
Thr Asn Asn Thr Arg Trp Gly Thr Gln Glu Met Lys Asn Cys Ser Phe
115 120 125

aac atc agc acc agc gtg cgc aac aag atg aag cgc gag tac gcc ctg 432
Asn Ile Ser Thr Ser Val Arg Asn Lys Met Lys Arg Glu Tyr Ala Leu
130 135 140

ttc tac agc ctg gac atc gtg ccc atc gac aac gac aac acc agc tac 480
Phe Tyr Ser Leu Asp Ile Val Pro Ile Asp Asn Asp Asn Thr Ser Tyr
145 150 155 160

cgc ctg cgc agc tgc aac aca tcg atc atc acc cag gcc tgc ccc aag 528
Arg Leu Arg Ser Cys Asn Thr Ser Ile Ile Thr Gln Ala Cys Pro Lys
165 170 175

gtg agc ttc gag ccc atc ccc atc cac ttc tgc gcc ccc gcc ggc ttc 576
Val Ser Phe Glu Pro Ile Pro Ile His Phe Cys Ala Pro Ala Gly Phe
180 185 190

gcc atc ctg aag tgc aac aac aag acc ttc aac ggc acc ggc ccc tgc 624
Ala Ile Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys
195 200 205

acc aac gtg agc acc gtg cag tgc acc cac gga att cgc ccc gtg gtg 672
Thr Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val
210 215 220

agc acc cag ctg ctg ctg aac ggc agc ctg gcc gag gag gag gtg gtg 720
Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Val Val
225 230 235 240

atc aga tct gag aac ttc acc aac aac gcc aag acc atc atc gtg cag 768
Ile Arg Ser Glu Asn Phe Thr Asn Asn Ala Lys Thr Ile Ile Val Gln
245 250 255

ctg aac gag agc gtg gag atc aac tgc acc cgc ccc aac aac aac acc 816
Leu Asn Glu Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr
260 265 270

cgc aag agc atc cac atc ggc cct ggc cgc gcc ttc tac acc acc ggc 864
Arg Lys Ser Ile His Ile Gly Pro Gly Arg Ala Phe Tyr Thr Thr Gly
275 280 285

gac atc atc ggc gac atc cgc cag gcc cac tgc aac atc tct aga acc 912
Asp Ile Ile Gly Asp Ile Arg Gln Ala His Cys Asn Ile Ser Arg Thr
290 295 300

aac tgg acc aac acc ctg aag cgc gtg gcc gag aag ctg cgc gag aag 960
Asn Trp Thr Asn Thr Leu Lys Arg Val Ala Glu Lys Leu Arg Glu Lys
305 310 315 320

ttc aac aac acc acc atc gtg ttc aac cag agc tcc ggc ggc gac ccc 1008
Phe Asn Asn Thr Thr Ile Val Phe Asn Gln Ser Ser Gly Gly Asp Pro
325 330 335

gag atc gtg atg cac agc ttc aac tgc ggc ggc gag ttc ttc tac tgc 1056
Glu Ile Val Met His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys
340 345 350

aac acc acc cag ctg ttc aac agc acc tgg aac gag acc aac agc gag 1104
Asn Thr Thr Gln Leu Phe Asn Ser Thr Trp Asn Glu Thr Asn Ser Glu
355 360 365

ggc aac atc act agt ggc acc atc acc ctg ccc tgc cgc atc aag cag 1152
Gly Asn Ile Thr Ser Gly Thr Ile Thr Leu Pro Cys Arg Ile Lys Gln
370 375 380

atc atc aac atg tgg cag gag gtg ggc aag gcc atg tac gcc ccc ccc 1200
Ile Ile Asn Met Trp Gln Glu Val Gly Lys Ala Met Tyr Ala Pro Pro
385 390 395 400

atc ggc ggc cag atc aag tgc ctg agc aac atc acc ggc ctg ctg ctg 1248
Ile Gly Gly Gln Ile Lys Cys Leu Ser Asn Ile Thr Gly Leu Leu Leu
405 410 415

acc cgc gac ggc ggc agc gac aac tcg ag 1277
Thr Arg Asp Gly Gly Ser Asp Asn Ser
420 425



<210> 46
<211> 425
<212> PRT
<213> Artificial Sequence

<400> 46

Ala Ser Ala Ala Asp Arg Leu Trp Val Thr Val Tyr Tyr Gly Val Pro
1 5 10 15

Val Trp Lys Asp Ala Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys
20 25 30

Ala Tyr Asp Thr Glu Val His Asn Val Trp Ala Thr His Ala Cys Val
35 40 45

Pro Thr Asp Pro Asn Pro Gln Glu Val Val Leu Gly Asn Val Thr Glu
50 55 60

Asn Phe Asn Met Gly Lys Asn Asn Met Val Glu Gln Met His Glu Asp
65 70 75 80

Ile Ile Ser Leu Trp Asp Gln Ser Leu Lys Pro Cys Val Lys Leu Thr
85 90 95

Pro Leu Cys Val Thr Leu Asn Cys Thr Lys Leu Lys Asn Ser Thr Asp
100 105 110

Thr Asn Asn Thr Arg Trp Gly Thr Gln Glu Met Lys Asn Cys Ser Phe
115 120 125

Asn Ile Ser Thr Ser Val Arg Asn Lys Met Lys Arg Glu Tyr Ala Leu
130 135 140

Phe Tyr Ser Leu Asp Ile Val Pro Ile Asp Asn Asp Asn Thr Ser Tyr
145 150 155 160

Arg Leu Arg Ser Cys Asn Thr Ser Ile Ile Thr Gln Ala Cys Pro Lys
165 170 175

Val Ser Phe Glu Pro Ile Pro Ile His Phe Cys Ala Pro Ala Gly Phe
180 185 190

Ala Ile Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys
195 200 205

Thr Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val
210 215 220

Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Val Val
225 230 235 240

Ile Arg Ser Glu Asn Phe Thr Asn Asn Ala Lys Thr Ile Ile Val Gln
245 250 255

Leu Asn Glu Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr
260 265 270

Arg Lys Ser Ile His Ile Gly Pro Gly Arg Ala Phe Tyr Thr Thr Gly
275 280 285

Asp Ile Ile Gly Asp Ile Arg Gln Ala His Cys Asn Ile Ser Arg Thr
290 295 300

Asn Trp Thr Asn Thr Leu Lys Arg Val Ala Glu Lys Leu Arg Glu Lys
305 310 315 320

Phe Asn Asn Thr Thr Ile Val Phe Asn Gln Ser Ser Gly Gly Asp Pro
325 330 335

Glu Ile Val Met His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys
340 345 350

Asn Thr Thr Gln Leu Phe Asn Ser Thr Trp Asn Glu Thr Asn Ser Glu
355 360 365

Gly Asn Ile Thr Ser Gly Thr Ile Thr Leu Pro Cys Arg Ile Lys Gln
370 375 380

Ile Ile Asn Met Trp Gln Glu Val Gly Lys Ala Met Tyr Ala Pro Pro
385 390 395 400

Ile Gly Gly Gln Ile Lys Cys Leu Ser Asn Ile Thr Gly Leu Leu Leu
405 410 415

Thr Arg Asp Gly Gly Ser Asp Asn Ser
420 425

<210> 47
<211> 1277
<212> DNA
<213> Artificial Sequence

<220>
<221> CDS
<222> (1)...(1277)

<400> 47

gct agc gcg gcc gac cgc ctg tgg gtg acc gtg tac tac ggc gtg ccc 48
Ala Ser Ala Ala Asp Arg Leu Trp Val Thr Val Tyr Tyr Gly Val Pro
1 5 10 15

gtg tgg aag gac gcc acc acc acc ctg ttc tgc gcc agc gac gcc aag 96
Val Trp Lys Asp Ala Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys
20 25 30

gcc tac gac acc gag gtg cac aac gtg tgg gcc acc cac gcg tgc gtg 144
Ala Tyr Asp Thr Glu Val His Asn Val Trp Ala Thr His Ala Cys Val
35 40 45

ccc acc gac ccc aac ccc cag gag gtg gtg ctg ggc aac gtg acc gag 192
Pro Thr Asp Pro Asn Pro Gln Glu Val Val Leu Gly Asn Val Thr Glu
50 55 60

aac ttc aac atg ggc aag aac aac atg gtg gag cag atg cac gag gat 240
Asn Phe Asn Met Gly Lys Asn Asn Met Val Glu Gln Met His Glu Asp
65 70 75 80

atc atc agc ctg tgg gac cag agc ctg aag ccc tgc gtg aag ctg acc 288
Ile Ile Ser Leu Trp Asp Gln Ser Leu Lys Pro Cys Val Lys Leu Thr
85 90 95

ccc ctg tgc gtg acc ctg caa tgc acc aag ctg aag cag agc acc gac 336
Pro Leu Cys Val Thr Leu Gln Cys Thr Lys Leu Lys Gln Ser Thr Asp
100 105 110

acc cag aac acc cgc tgg ggc acc cag gag atg aag aac tgc agc ttc 384
Thr Gln Asn Thr Arg Trp Gly Thr Gln Glu Met Lys Asn Cys Ser Phe
115 120 125

aac atc agc acc agc gtg cgc aac aag atg aag cgc gag tac gcc ctg 432
Asn Ile Ser Thr Ser Val Arg Asn Lys Met Lys Arg Glu Tyr Ala Leu
130 135 140

ttc tac agc ctg gac atc gtg ccc atc gac aac gac aac acc agc tac 480
Phe Tyr Ser Leu Asp Ile Val Pro Ile Asp Asn Asp Asn Thr Ser Tyr
145 150 155 160

cgc ctg cgc agc tgc aac aca tcg atc atc acc cag gcc tgc ccc aag 528
Arg Leu Arg Ser Cys Asn Thr Ser Ile Ile Thr Gln Ala Cys Pro Lys
165 170 175

gtg agc ttc gag ccc atc ccc atc cac ttc tgc gcc ccc gcc ggc ttc 576
Val Ser Phe Glu Pro Ile Pro Ile His Phe Cys Ala Pro Ala Gly Phe
180 185 190

gcc atc ctg aag tgc aac aac aag acc ttc aac ggc acc ggc ccc tgc 624
Ala Ile Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys
195 200 205

acc aac gtg agc acc gtg cag tgc acc cac gga att cgc ccc gtg gtg 672
Thr Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val
210 215 220

agc acc cag ctg ctg ctg aac ggc agc ctg gcc gag gag gag gtg gtg 720
Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Val Val
225 230 235 240

atc aga tct gag aac ttc acc aac aac gcc aag acc atc atc gtg cag 768
Ile Arg Ser Glu Asn Phe Thr Asn Asn Ala Lys Thr Ile Ile Val Gln
245 250 255

ctg aac gag agc gtg gag atc aac tgc acc cgc ccc aac aac aac acc 816
Leu Asn Glu Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr
260 265 270

cgc aag agc atc cac atc ggc cct ggc cgc gcc ttc tac acc acc ggc 864
Arg Lys Ser Ile His Ile Gly Pro Gly Arg Ala Phe Tyr Thr Thr Gly
275 280 285

gac atc atc ggc gac atc cgc cag gcc cac tgc aac atc tct aga acc 912
Asp Ile Ile Gly Asp Ile Arg Gln Ala His Cys Asn Ile Ser Arg Thr
290 295 300

aac tgg acc aac acc ctg aag cgc gtg gcc gag aag ctg cgc gag aag 960
Asn Trp Thr Asn Thr Leu Lys Arg Val Ala Glu Lys Leu Arg Glu Lys
305 310 315 320

ttc aac aac acc acc atc gtg ttc aac cag agc tcc ggc ggc gac ccc 1008
Phe Asn Asn Thr Thr Ile Val Phe Asn Gln Ser Ser Gly Gly Asp Pro
325 330 335

gag atc gtg atg cac agc ttc aac tgc ggc ggc gag ttc ttc tac tgc 1056
Glu Ile Val Met His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys
340 345 350

aac acc acc cag ctg ttc aac agc acc tgg aac gag acc aac agc gag 1104
Asn Thr Thr Gln Leu Phe Asn Ser Thr Trp Asn Glu Thr Asn Ser Glu
355 360 365

ggc aac atc act agt ggc acc atc acc ctg ccc tgc cgc atc aag cag 1152
Gly Asn Ile Thr Ser Gly Thr Ile Thr Leu Pro Cys Arg Ile Lys Gln
370 375 380

atc atc aac atg tgg cag gag gtg ggc aag gcc atg tac gcc ccc ccc 1200
Ile Ile Asn Met Trp Gln Glu Val Gly Lys Ala Met Tyr Ala Pro Pro
385 390 395 400

atc ggc ggc cag atc aag tgc ctg agc aac atc acc ggc ctg ctg ctg 1248
Ile Gly Gly Gln Ile Lys Cys Leu Ser Asn Ile Thr Gly Leu Leu Leu
405 410 415

acc cgc gac ggc ggc agc gac aac tcg ag 1277
Thr Arg Asp Gly Gly Ser Asp Asn Ser
420 425

<210> 48
<211> 425
<212> PRT
<213> Artificial Sequence

<400> 48
Ala Ser Ala Ala Asp Arg Leu Trp Val Thr Val Tyr Tyr Gly Val Pro
1 5 10 15

Val Trp Lys Asp Ala Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys
20 25 30

Ala Tyr Asp Thr Glu Val His Asn Val Trp Ala Thr His Ala Cys Val
35 40 45

Pro Thr Asp Pro Asn Pro Gln Glu Val Val Leu Gly Asn Val Thr Glu
50 55 60

Asn Phe Asn Met Gly Lys Asn Asn Met Val Glu Gln Met His Glu Asp
65 70 75 80

Ile Ile Ser Leu Trp Asp Gln Ser Leu Lys Pro Cys Val Lys Leu Thr
85 90 95

Pro Leu Cys Val Thr Leu Gln Cys Thr Lys Leu Lys Gln Ser Thr Asp
100 105 110

Thr Gln Asn Thr Arg Trp Gly Thr Gln Glu Met Lys Asn Cys Ser Phe
115 120 125

Asn Ile Ser Thr Ser Val Arg Asn Lys Met Lys Arg Glu Tyr Ala Leu
130 135 140

Phe Tyr Ser Leu Asp Ile Val Pro Ile Asp Asn Asp Asn Thr Ser Tyr
145 150 155 160

Arg Leu Arg Ser Cys Asn Thr Ser Ile Ile Thr Gln Ala Cys Pro Lys
165 170 175

Val Ser Phe Glu Pro Ile Pro Ile His Phe Cys Ala Pro Ala Gly Phe
180 185 190

Ala Ile Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys
195 200 205

Thr Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val
210 215 220

Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Val Val
225 230 235 240

Ile Arg Ser Glu Asn Phe Thr Asn Asn Ala Lys Thr Ile Ile Val Gln
245 250 255

Leu Asn Glu Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr
260 265 270

Arg Lys Ser Ile His Ile Gly Pro Gly Arg Ala Phe Tyr Thr Thr Gly
275 280 285

Asp Ile Ile Gly Asp Ile Arg Gln Ala His Cys Asn Ile Ser Arg Thr
290 295 300

Asn Trp Thr Asn Thr Leu Lys Arg Val Ala Glu Lys Leu Arg Glu Lys
305 310 315 320

Phe Asn Asn Thr Thr Ile Val Phe Asn Gln Ser Ser Gly Gly Asp Pro
325 330 335

Glu Ile Val Met His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys
340 345 350

Asn Thr Thr Gln Leu Phe Asn Ser Thr Trp Asn Glu Thr Asn Ser Glu
355 360 365

Gly Asn Ile Thr Ser Gly Thr Ile Thr Leu Pro Cys Arg Ile Lys Gln
370 375 380

Ile Ile Asn Met Trp Gln Glu Val Gly Lys Ala Met Tyr Ala Pro Pro
385 390 395 400

Ile Gly Gly Gln Ile Lys Cys Leu Ser Asn Ile Thr Gly Leu Leu Leu
405 410 415

Thr Arg Asp Gly Gly Ser Asp Asn Ser
420 425














<210> 49
<211> 1277
<212> DNA
<213> Artificial Sequence

<400> 49












gct agc gcg gcc gac cgc ctg tgg gtg acc gtg tac tac ggc gtg ccc 48
Ala Ser Ala Ala Asp Arg Leu Trp Val Thr Val Tyr Tyr Gly Val Pro
1 5 10 15

gtg tgg aag gac gcc acc acc acc ctg ttc tgc gcc agc gac gcc aag 96
Val Trp Lys Asp Ala Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys
20 25 30

gcc tac gac acc gag gtg cac aac gtg tgg gcc acc cac gcg tgc gtg 144
Ala Tyr Asp Thr Glu Val His Asn Val Trp Ala Thr His Ala Cys Val
35 40 45

ccc acc gac ccc aac ccc cag gag gtg gtg ctg ggc aac gtg acc gag 192
Pro Thr Asp Pro Asn Pro Gln Glu Val Val Leu Gly Asn Val Thr Glu
50 55 60

aac ttc aac atg ggc aag aac aac atg gtg gag cag atg cac gag gat 240
Asn Phe Asn Met Gly Lys Asn Asn Met Val Glu Gln Met His Glu Asp
65 70 75 80

atc atc agc ctg tgg gac cag agc ctg aag ccc tgc gtg aag ctg acc 288
Ile Ile Ser Leu Trp Asp Gln Ser Leu Lys Pro Cys Val Lys Leu Thr
85 90 95

ccc ctg tgc gtg acc ctg caa tgc acc aag ctg aag cag agc acc gac 336
Pro Leu Cys Val Thr Leu Gln Cys Thr Lys Leu Lys Gln Ser Thr Asp
100 105 110

acc cag aac acc cgc tgg ggc acc cag gag atg aag aac tgc agc ttc 384
Thr Gln Asn Thr Arg Trp Gly Thr Gln Glu Met Lys Asn Cys Ser Phe
115 120 125

cag atc agc acc agc gtg cgc aac aag atg aag cgc gag tac gcc ctg 432
Gln Ile Ser Thr Ser Val Arg Asn Lys Met Lys Arg Glu Tyr Ala Leu
130 135 140

ttc tac agc ctg gac atc gtg ccc atc gac aac gac cag acc agc tac 480
Phe Tyr Ser Leu Asp Ile Val Pro Ile Asp Asn Asp Gln Thr Ser Tyr
145 150 155 160

cgc ctg cgc agc tgc aac aca tcg atc atc acc cag gcc tgc ccc aag 528
Arg Leu Arg Ser Cys Asn Thr Ser Ile Ile Thr Gln Ala Cys Pro Lys
165 170 175

gtg agc ttc gag ccc atc ccc atc cac ttc tgc gcc ccc gcc ggc ttc 576
Val Ser Phe Glu Pro Ile Pro Ile His Phe Cys Ala Pro Ala Gly Phe
180 185 190

gcc atc ctg aag tgc aac aac aag acc ttc aac ggc acc ggc ccc tgc 624
Ala Ile Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys
195 200 205

acc aac gtg agc acc gtg cag tgc acc cac gga att cgc ccc gtg gtg 672
Thr Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val
210 215 220

agc acc cag ctg ctg ctg aac ggc agc ctg gcc gag gag gag gtg gtg 720
Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Val Val
225 230 235 240

atc aga tct gag aac ttc acc aac aac gcc aag acc atc atc gtg cag 768
Ile Arg Ser Glu Asn Phe Thr Asn Asn Ala Lys Thr Ile Ile Val Gln
245 250 255

ctg aac gag agc gtg gag atc aac tgc acc cgc ccc aac aac aac acc 816
Leu Asn Glu Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr
260 265 270

cgc aag agc atc cac atc ggc cct ggc cgc gcc ttc tac acc acc ggc 864
Arg Lys Ser Ile His Ile Gly Pro Gly Arg Ala Phe Tyr Thr Thr Gly
275 280 285

gac atc atc ggc gac atc cgc cag gcc cac tgc aac atc tct aga acc 912
Asp Ile Ile Gly Asp Ile Arg Gln Ala His Cys Asn Ile Ser Arg Thr
290 295 300

aac tgg acc aac acc ctg aag cgc gtg gcc gag aag ctg cgc gag aag 960
Asn Trp Thr Asn Thr Leu Lys Arg Val Ala Glu Lys Leu Arg Glu Lys
305 310 315 320

ttc aac aac acc acc atc gtg ttc aac cag agc tcc ggc ggc gac ccc 1008
Phe Asn Asn Thr Thr Ile Val Phe Asn Gln Ser Ser Gly Gly Asp Pro
325 330 335

gag atc gtg atg cac agc ttc aac tgc ggc ggc gag ttc ttc tac tgc 1056
Glu Ile Val Met His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys
340 345 350

aac acc acc cag ctg ttc aac agc acc tgg aac gag acc aac agc gag 1104
Asn Thr Thr Gln Leu Phe Asn Ser Thr Trp Asn Glu Thr Asn Ser Glu
355 360 365

ggc aac atc act agt ggc acc atc acc ctg ccc tgc cgc atc aag cag 1152
Gly Asn Ile Thr Ser Gly Thr Ile Thr Leu Pro Cys Arg Ile Lys Gln
370 375 380

atc atc aac atg tgg cag gag gtg ggc aag gcc atg tac gcc ccc ccc 1200
Ile Ile Asn Met Trp Gln Glu Val Gly Lys Ala Met Tyr Ala Pro Pro
385 390 395 400

atc ggc ggc cag atc aag tgc ctg agc aac atc acc ggc ctg ctg ctg 1248
Ile Gly Gly Gln Ile Lys Cys Leu Ser Asn Ile Thr Gly Leu Leu Leu
405 410 415

acc cgc gac ggc ggc agc gac aac tcg ag 1277
Thr Arg Asp Gly Gly Ser Asp Asn Ser
420 425

<210> 50
<211> 425
<212> PRT
<213> Artificial Sequence

<400> 50
Ala Ser Ala Ala Asp Arg Leu Trp Val Thr Val Tyr Tyr Gly Val Pro
1 5 10 15

Val Trp Lys Asp Ala Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys
20 25 30

Ala Tyr Asp Thr Glu Val His Asn Val Trp Ala Thr His Ala Cys Val
35 40 45

Pro Thr Asp Pro Asn Pro Gln Glu Val Val Leu Gly Asn Val Thr Glu
50 55 60

Asn Phe Asn Met Gly Lys Asn Asn Met Val Glu Gln Met His Glu Asp
65 70 75 80

Ile Ile Ser Leu Trp Asp Gln Ser Leu Lys Pro Cys Val Lys Leu Thr
85 90 95

Pro Leu Cys Val Thr Leu Gln Cys Thr Lys Leu Lys Gln Ser Thr Asp
100 105 110

Thr Gln Asn Thr Arg Trp Gly Thr Gln Glu Met Lys Asn Cys Ser Phe
115 120 125

Gln Ile Ser Thr Ser Val Arg Asn Lys Met Lys Arg Glu Tyr Ala Leu
130 135 140

Phe Tyr Ser Leu Asp Ile Val Pro Ile Asp Asn Asp Gln Thr Ser Tyr
145 150 155 160

Arg Leu Arg Ser Cys Asn Thr Ser Ile Ile Thr Gln Ala Cys Pro Lys
165 170 175

Val Ser Phe Glu Pro Ile Pro Ile His Phe Cys Ala Pro Ala Gly Phe
180 185 190

Ala Ile Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys
195 200 205

Thr Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val
210 215 220

Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Val Val
225 230 235 240

Ile Arg Ser Glu Asn Phe Thr Asn Asn Ala Lys Thr Ile Ile Val Gln
245 250 255

Leu Asn Glu Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr
260 265 270

Arg Lys Ser Ile His Ile Gly Pro Gly Arg Ala Phe Tyr Thr Thr Gly
275 280 285

Asp Ile Ile Gly Asp Ile Arg Gln Ala His Cys Asn Ile Ser Arg Thr
290 295 300

Asn Trp Thr Asn Thr Leu Lys Arg Val Ala Glu Lys Leu Arg Glu Lys
305 310 315 320

Phe Asn Asn Thr Thr Ile Val Phe Asn Gln Ser Ser Gly Gly Asp Pro
325 330 335

Glu Ile Val Met His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys
340 345 350

Asn Thr Thr Gln Leu Phe Asn Ser Thr Trp Asn Glu Thr Asn Ser Glu
355 360 365

Gly Asn Ile Thr Ser Gly Thr Ile Thr Leu Pro Cys Arg Ile Lys Gln
370 375 380

Ile Ile Asn Met Trp Gln Glu Val Gly Lys Ala Met Tyr Ala Pro Pro
385 390 395 400

Ile Gly Gly Gln Ile Lys Cys Leu Ser Asn Ile Thr Gly Leu Leu Leu
405 410 415

Thr Arg Asp Gly Gly Ser Asp Asn Ser
420 425



<210> 51
<211> 1277
<212> DNA
<213> Artificial Sequence

<400> 51


























gct agc gcg gcc gac cgc ctg tgg gtg acc gtg tac tac ggc gtg ccc 48
gtg tgg aag gac gcc acc acc acc ctg ttc tgc gcc agc gac gcc aag 96
gcc tac gac acc gag gtg cac aac gtg tgg gcc acc cac gcg tgc gtg 144
ccc acc gac ccc aac ccc cag gag gtg gtg ctg ggc aac gtg acc gag 192
aac ttc aac atg ggc aag aac aac atg gtg gag cag atg cac gag gat 240
atc atc agc ctg tgg gac cag agc ctg aag ccc tgc gtg aag ctg acc 288
ccc ctg tgc gtg acc ctg aac tgc acc aag ctg aag aac agc acc gac 336
acc aac aac acc cgc tgg ggc acc cag gag atg aag aac tgc agc ttc 384
cag atc agc acc agc gtg cgc aac aag atg aag cgc gag tac gcc ctg 432
ttc tac agc ctg gac atc gtg ccc atc gac aac gac cag acc agc tac 480
cgc ctg cgc agc tgc aac aca tcg atc atc acc cag gcc tgc ccc aag 528
gtg agc ttc gag ccc atc ccc atc cac ttc tgc gcc ccc gcc ggc ttc 576
gcc atc ctg aag tgc aac aac aag acc ttc aac ggc acc ggc ccc tgc 624
acc aac gtg agc acc gtg cag tgc acc cac gga att cgc ccc gtg gtg 672
agc acc cag ctg ctg ctg aac ggc agc ctg gcc gag gag gag gtg gtg 720
atc aga tct gag aac ttc acc aac aac gcc aag acc atc atc gtg cag 768
ctg aac gag agc gtg gag atc aac tgc acc cgc ccc aac aac aac acc 816
cgc aag agc atc cac atc ggc cct ggc cgc gcc ttc tac acc acc ggc 864
gac atc atc ggc gac atc cgc cag gcc cac tgc aac atc tct aga acc 912
aac tgg acc aac acc ctg aag cgc gtg gcc gag aag ctg cgc gag aag 960
ttc aac aac acc acc atc gtg ttc aac cag agc tcc ggc ggc gac ccc 1008
gag atc gtg atg cac agc ttc aac tgc ggc ggc gag ttc ttc tac tgc 1056
aac acc acc cag ctg ttc aac agc acc tgg aac gag acc aac agc gag 1104
ggc aac atc act agt ggc acc atc acc ctg ccc tgc cgc atc aag cag 1152
atc atc aac atg tgg cag gag gtg ggc aag gcc atg tac gcc ccc ccc 1200
atc ggc ggc cag atc aag tgc ctg agc aac atc acc ggc ctg ctg ctg 1248
acc cgc gac ggc ggc agc gac aac tcg ag 1277



<210> 52
<211> 425
<212> PRT
<213> Artificial Sequence

<400> 52
Ala Ser Ala Ala Asp Arg Leu Trp Val Thr Val Tyr Tyr Gly Val Pro
1 5 10 15

Val Trp Lys Asp Ala Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys
20 25 30

Ala Tyr Asp Thr Glu Val His Asn Val Trp Ala Thr His Ala Cys Val
35 40 45

Pro Thr Asp Pro Asn Pro Gln Glu Val Val Leu Gly Asn Val Thr Glu
50 55 60

Asn Phe Asn Met Gly Lys Asn Asn Met Val Glu Gln Met His Glu Asp
65 70 75 80

Ile Ile Ser Leu Trp Asp Gln Ser Leu Lys Pro Cys Val Lys Leu Thr
85 90 95

Pro Leu Cys Val Thr Leu Asn Cys Thr Lys Leu Lys Asn Ser Thr Asp
100 105 110

Thr Asn Asn Thr Arg Trp Gly Thr Gln Glu Met Lys Asn Cys Ser Phe
115 120 125

Gln Ile Ser Thr Ser Val Arg Asn Lys Met Lys Arg Glu Tyr Ala Leu
130 135 140

Phe Tyr Ser Leu Asp Ile Val Pro Ile Asp Asn Asp Gln Thr Ser Tyr
145 150 155 160

Arg Leu Arg Ser Cys Asn Thr Ser Ile Ile Thr Gln Ala Cys Pro Lys
165 170 175

Val Ser Phe Glu Pro Ile Pro Ile His Phe Cys Ala Pro Ala Gly Phe
180 185 190

Ala Ile Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys
195 200 205

Thr Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val
210 215 220

Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Val Val
225 230 235 240

Ile Arg Ser Glu Asn Phe Thr Asn Asn Ala Lys Thr Ile Ile Val Gln
245 250 255

Leu Asn Glu Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr
260 265 270

Arg Lys Ser Ile His Ile Gly Pro Gly Arg Ala Phe Tyr Thr Thr Gly
275 280 285

Asp Ile Ile Gly Asp Ile Arg Gln Ala His Cys Asn Ile Ser Arg Thr
290 295 300

Asn Trp Thr Asn Thr Leu Lys Arg Val Ala Glu Lys Leu Arg Glu Lys
305 310 315 320

Phe Asn Asn Thr Thr Ile Val Phe Asn Gln Ser Ser Gly Gly Asp Pro
325 330 335

Glu Ile Val Met His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys
340 345 350

Asn Thr Thr Gln Leu Phe Asn Ser Thr Trp Asn Glu Thr Asn Ser Glu
355 360 365

Gly Asn Ile Thr Ser Gly Thr Ile Thr Leu Pro Cys Arg Ile Lys Gln
370 375 380

Ile Ile Asn Met Trp Gln Glu Val Gly Lys Ala Met Tyr Ala Pro Pro
385 390 395 400

Ile Gly Gly Gln Ile Lys Cys Leu Ser Asn Ile Thr Gly Leu Leu Leu
405 410 415

Thr Arg Asp Gly Gly Ser Asp Asn Ser
420 425

<210> 53
<211> 432
<212> DNA
<213> Artificial Sequence

<220>
<221> CDS
<222> (1)...(432)

<400> 53

tct aga gcg cta cct cca gga cca gcg ctt cct ggg cat gtg ggg ctg 48
Ser Arg Ala Leu Pro Pro Gly Pro Ala Leu Pro Gly His Val Gly Leu
1 5 10 15

ctc cgg caa gct gat ctg cac cac ggc cgt gcc ctg gaa cgc cag ctg 96
Leu Arg Gln Ala Asp Leu His His Gly Arg Ala Leu Glu Arg Gln Leu
20 25 30

gag caa caa gaa cct gag cca gat ttg gga caa cat gac ctg gat gga 144
Glu Gln Gln Glu Pro Glu Pro Asp Leu Gly Gln His Asp Leu Asp Gly
35 40 45

gtg gga gcg cga gat cag caa cta cac cga gat cat cta cag cct gat 192
Val Gly Ala Arg Asp Gln Gln Leu His Arg Asp His Leu Gln Pro Asp
50 55 60

cga gga gag cca gaa cca gca gga gaa gaa cga gct gga cct gct cca 240
Arg Gly Glu Pro Glu Pro Ala Gly Glu Glu Arg Ala Gly Pro Ala Pro
65 70 75 80

gct gga caa gtg ggc aag ctt gtg gaa ctg gtt caa cat cac caa ctg 288
Ala Gly Gln Val Gly Lys Leu Val Glu Leu Val Gln His His Gln Leu
85 90 95

gct gtg gta cat caa gat ttt cat cat gat cgt ggg cgg cct gat cgg 336
Ala Val Val His Gln Asp Phe His His Asp Arg Gly Arg Pro Asp Arg
100 105 110

cct gcg cat cgt gtt cac cgt gct gag cat cgt gaa ccg cgt gcg cca 384
Pro Ala His Arg Val His Arg Ala Glu His Arg Glu Pro Arg Ala Pro
115 120 125

ggg cta cag ccc cct gag ctt cca gac ccg cct gcc cgt gcc ccg cgg 432
Gly Leu Gln Pro Pro Glu Leu Pro Asp Pro Pro Ala Arg Ala Pro Arg
130 135 140



<210> 54
<211> 144
<212> PRT
<213> Artificial Sequence

<400> 54
Ser Arg Ala Leu Pro Pro Gly Pro Ala Leu Pro Gly His Val Gly Leu
1 5 10 15

Leu Arg Gln Ala Asp Leu His His Gly Arg Ala Leu Glu Arg Gln Leu
20 25 30

Glu Gln Gln Glu Pro Glu Pro Asp Leu Gly Gln His Asp Leu Asp Gly
35 40 45

Val Gly Ala Arg Asp Gln Gln Leu His Arg Asp His Leu Gln Pro Asp
50 55 60

Arg Gly Glu Pro Glu Pro Ala Gly Glu Glu Arg Ala Gly Pro Ala Pro
65 70 75 80

Ala Gly Gln Val Gly Lys Leu Val Glu Leu Val Gln His His Gln Leu
85 90 95

Ala Val Val His Gln Asp Phe His His Asp Arg Gly Arg Pro Asp Arg
100 105 110

Pro Ala His Arg Val His Arg Ala Glu His Arg Glu Pro Arg Ala Pro
115 120 125

Gln Pro Pro Glu Leu Pro Asp Pro Pro Ala Arg Val Thr Asp



Ma- 



130 135 140






<213> Artificial Sequence

<220>
<222> (1)...(434)

<400> 55

tct aga gcg cta cct cca gga cca gcg ctt cct ggg cat gtg ggg ctg 48
Ser Arg Ala Leu Pro Pro Gly Pro Ala Leu Pro Gly His Val Gly Leu
1 5 10 15

ctc cgg caa gct gat ctg cac cac ggc cgt gcc ctg gaa cgc cag ctg 96
Leu Arg Gln Ala Asp Leu His His Gly Arg Ala Leu Glu Arg Gln Leu
20 25 30

gag caa caa gaa cct gag cca gat ttg gga caa cat gac ctg gat gga 144
Glu Gln Gln Glu Pro Glu Pro Asp Leu Gly Gln His Asp Leu Asp Gly
35 40 45

gtg gga gcg cga gat cag caa cta cac cga gat cat cta cag cct gat 192
Val Gly Ala Arg Asp Gln Gln Leu His Arg Asp His Leu Gln Pro Asp
50 55 60

cga gga gag cca gaa cca gca gga gaa gaa cga gct gga cct gct cca 240
Arg Gly Glu Pro Glu Pro Ala Gly Glu Glu Arg Ala Gly Pro Ala Pro
65 70 75 80

gct gga caa gtg ggc aag ctt gtg gaa ctg gtt caa cat cac caa ctg 288
Ala Gly Gln Val Gly Lys Leu Val Glu Leu Val Gln His His Gln Leu
85 90 95

gct gtg gta cat caa gat ttt cat cat gat cgt ggg cgg cct gat cgg 336
Ala Val Val His Gln Asp Phe His His Asp Arg Gly Arg Pro Asp Arg
100 105 110

cct gcg cat cgt gtt cac cgt gct gag cat cgt gaa ccg cgt gcg cca 384
Pro Ala His Arg Val His Arg Ala Glu His Arg Glu Pro Arg Ala Pro
115 120 125

ggg atg cag ccc cct gag ctt cca gac ccg cct gcc cgt gtg acg gat 432
Gly Met Gln Pro Pro Glu Leu Pro Asp Pro Pro Ala Arg Val Thr Asp
130 135 140



<210> 56
<211> 144
<212> PRT
<213> Artificial Sequence

<400> 56
Ser Arg Ala Leu Pro Pro Gly Pro Ala Leu Pro Gly His Val Gly Leu
1 5 10 15

Leu Arg Gln Ala Asp Leu His His Gly Arg Ala Leu Glu Arg Gln Leu
20 25 30

Glu Gln Gln Glu Pro Glu Pro Asp Leu Gly Gln His Asp Leu Asp Gly
35 40 45

Val Gly Ala Arg Asp Gln Gln Leu His Arg Asp His Leu Gln Pro Asp
50 55 60

Arg Gly Glu Pro Glu Pro Ala Gly Glu Glu Arg Ala Gly Pro Ala Pro
65 70 75 80

Ala Gly Gln Val Gly Lys Leu Val Glu Leu Val Gln His His Gln Leu
85 90 95

Ala Val Val His Gln Asp Phe His His Asp Arg Gly Arg Pro Asp Arg
100 105 110

Pro Ala His Arg Val His Arg Ala Glu His Arg Glu Pro Arg Ala Pro
115 120 125

Gly Met Gln Pro Pro Glu Leu Pro Asp Pro Pro Ala Arg Val Thr Asp
130 135 140









<210> 57
<211> 281
<212> DNA
<213> Artificial Sequence

<220>
<221> CDS
<222> (1)...(281)

<400> 57

tct aga gcg cta cct cca gga cca gcg ctt cct ggg cat gtg ggg ctg 48
Ser Arg Ala Leu Pro Pro Gly Pro Ala Leu Pro Gly His Val Gly Leu
1 5 10 15

ctc cgg caa gct gat ctg cac cac ggc cgt gcc ctg gaa cgc cag ctg 96
Leu Arg Gln Ala Asp Leu His His Gly Arg Ala Leu Glu Arg Gln Leu
20 25 30

gag caa caa gaa cct gag cca gat ttg gga caa cat gac ctg gat gga 144
Glu Gln Gln Glu Pro Glu Pro Asp Leu Gly Gln His Asp Leu Asp Gly
35 40 45

gtg gga gcg cga gat cag caa cta cac cga gat cat cta cag cct gat 192
Val Gly Ala Arg Asp Gln Gln Leu His Arg Asp His Leu Gln Pro Asp
50 55 60

cga gga gag cca gaa cca gca gga gaa gaa cga gct gga cct gct cca 240
Arg Gly Glu Pro Glu Pro Ala Gly Glu Glu Arg Ala Gly Pro Ala Pro
65 70 75 80

gct gga caa gtg ggc aag ctt gtg tga ctg att gag gat cc 281
Ala Gly Gln Val Gly Lys Leu Val * Leu Ile Glu Asp
85 90



<210> 58
<211> 92
<212> PRT
<213> Artificial Sequence

<400> 58
Ser Arg Ala Leu Pro Pro Gly Pro Ala Leu Pro Gly His Val Gly Leu
1 5 10 15

Leu Arg Gln Ala Asp Leu His His Gly Arg Ala Leu Glu Arg Gln Leu
20 25 30

Glu Gln Gln Glu Pro Glu Pro Asp Leu Gly Gln His Asp Leu Asp Gly
35 40 45

Val Gly Ala Arg Asp Gln Gln Leu His Arg Asp His Leu Gln Pro Asp
50 55 60

Arg Gly Glu Pro Glu Pro Ala Gly Glu Glu Arg Ala Gly Pro Ala Pro
65 70 75 80

Ala Gly Gln Val Gly Lys Leu Val Leu Ile Glu Asp
85 90



<210> 59

<211> 272

<212> DNA

<213> Artificial Sequence



: 2C: 



GDS 

a; , 



1 it 



Asp Cy: 



59 

Teg 

\ia 



sec tgc lqc tga 
rhr Cys Cys 



teg tgg see 
Ser Trp Pro 
10 



gca teg tag age tqc 
Ala Ser Trp Ser Cys 
15 



4 3 



tgg 
T ro 



agt 
Ser 



Al i Civ 



act q g 3 
Thr Gly 



jug act ggg aga tec 
Ala Ala Gly Arg Ser 

20 

7cc agg age tga aga 
Ala Arg Ser * Arg 

3 5 



tga agt ac: 
25 

act ctg cag 
Thr Leu Gin 
40 



ggt gga acc tgc tec 96 
Gly Gly Thr Cys Ser 
30 

tga gee tgc tga acg 144 
+ Ala Cys * Thr 



cca 
Pro 



tgc 
Cys 



agg 
Arg 

7 5 



ccg cc-i 
Pro Fro 
4 5 

ag ; gca 
Ser Ala 

6 1 

get tcj 
Ala Ser 



teg 



t et 
Ser 



Ser 



ec5 tgg ccg agg 
Pro Trp Pro Arg 
50 

ggc gcg gca tec 
Gly Ala Ala Ser 
65 

gcg ccc tgc tgt 
Ala Pro Cys Cys 

80 



gca ccg acc 
Ala Pro Thr 



tgc aca tec 
Cys Thr Ser 



aag gat c; 
Lys Asp 



gcg tga teg agg tgg 
Ala * Ser Arg Trp 
5 5 

cca ccc gaa ttc gec 
Pro Pro Glu Phe Ala 
7 0 



192 



24C 



272 



<210> 60

<211> 84

<212> PRT

<213> Artificial Sequence

<400> 60

Ile Asp Cys Ala Thr Cys Cys Ser Trp Pro Ala Ser Trp Ser Cys Trp
1 5 10 15

Ala Gly Ala Ala Gly Arg Ser Ser Thr Gly Gly Thr Cys Ser Ser Thr
20 25 30

Gly Ala Arg Ser Arg Thr Leu Gln Ala Cys Thr Pro Pro Pro Ser Pro
35 40 45

Trp Pro Arg Ala Pro Thr Ala Ser Arg Trp Cys Ser Ala Ser Gly Ala
50 55 60

Ala Ser Cys Thr Ser Pro Pro Glu Phe Ala Arg Ala Ser Ser Ala Pro
65 70 75 80

Cys Cys Lys Asp

<210> 61

<211> 798

<212> DNA

<213> Artificial Sequence

<220>

<221> CDS

<222> (1)...(798)

<400> 61

ga f cag egg caa gga gat ttt ccg ccc egg egg egg cga cat gcg 

Gl i Gin Arg 31a Gly Asp Phe Pro" Pro Arg Arg Arg Arg His Ala 
5 1C 15 

ca i eta gcg rag cga get gta caa gca caa ggt ggt gaa gat cga 

Gin Lei: Ala 31 n Arg Aia Vai Gin Val Gin Gly 31y 3iu Asp Arg 

2 0 2 5 3 0 



gc : ccc ggg cat cgc :cc cac caa ggc caa gcg ccg cgt ggt gca gcg 144 

Ail Pr Gly His Arg Pro Hie Gin Gly 31c Ala Pre Arg 31y Aia Ala 

3; 40 45 

eg i ga. : gcg cgc cgt ggg cat egg cge tac get cct egg ctt cct ggg 192 

Arg Gi.i Aia Arg Arg 31 y His Arg Arg Tyr Val Pro Arg Leu Pro Gly 

5'.' 55 ' 60 

cge tgc agg ca g cac eat ggg cgc cgc cag cct gac cct gac cgt gca 24 0 

Arg Gyc Arc Gin His His Gly Arg Arg Gin Pro Asp Pro Asp Arg Ala 

So 7 0 7 5 80 

ggc ccc cca -get get gag egg cat cgt gca gca gca gaa caa cct get 2 88 

Gly ?r- Pr: Aia A La Giu Arg His Arg Aia Aia Aia Glu Gin Pro Ala 
35 n 95 

gcg eg- cat cga ggc cca gca gca cct get cea get gac cgt gtg ggg 336 

Al.i Ar.j Hi:. Arg Gly Pre Ala Ala Pro Ala Pro Aia Asp Arg Val Gly 

100 L 05 110 

cat ca. j gca get cca ggc ccg cgt get ggc tct aga gcg eta cct cca 384 

Hie Sir. Aia Aia Pro Gly Pro Arg Ala 3Ly Ser Arg Ala Leu Pro Pro 

lie ' 120 ' 125 

gga cca gcg ctt cct ggg cat gtg ggc et g etc .egg caa get gat ctg 4 32 

Gly Pro Ala Leu Pro 31y His Val Giy Leu Leu Arg Gin Ala Asp Leu 

13 1 " 1 L35 14 3 

cac cac ggc cgt gee ctg gaa cgc cag ccg gag caa caa gaa cct gag 4 30 

Hie Hie Giy Arg Aia Leu Glu Arg Gin Leu Glu Gin Gin Glu Pro Glu 

14 : 150 155 160 

cc,i gat teg gaa caa cat gac ctg gat gga gtg gga gcg cga gat cag • . •• 

Pry Asp Leu Giy Gin iis Asp Leu Asp 3Ly Val Giy Ala Arg Asp Gin 
165 * I/O ' " 175 

caa eta cac cga gat cat eta cag cct gat cga gga gag cca gaa cca 

Gin Leu His Arg Asp His Leu Gin Pro Asp Arg Giy Glu Pro Glu Pro 

13 0 16 5 19 0 
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gca gga g a a gaa cga get gga zzt get :ca get gga caa gtg ggc aag 
Ala Gly Glu 31 j Arg Aia Gly Pro Aia Pro Ala Gly 31 n Val 3iy lys 
195 ^ 2 3 D 2C5 



624 



ctt gt 7 gaa ct g gtt caa cat ca: :aa ste get gtg gta cat eaa gat 
Lei Val. Glu Lei Val Gin His His 31r. Lee Ala Val Val His Gin Asd 

21": 215 220 



ttt oa r :a: gat :gt ggg egg c:t gat egg cct gcg :at cgt gtt :ac 
Fhe Hi ; H.^s Asp Arg Gly Arg ?r 3 Asp Arg Pre Aia His Arg Val His 



23 ) 



235 



240 



720 



cgt gc gag :i: :gt ga 1 ccg egt gcg~~ cca ggg eta :ag cc: :ct gag 
Ar 7 Al : oiu Hi_s Arg Gi 1 Pro Arg Ala Pro Gly Leu 5 in Pr:> Pro Glu 
145 250 ' 255 



'63 



J ; - - - Asp 



:-ro Al -i Arg Aia Pro Arg 



i.eu 
1 

Arg 
Ala 
Arg 

Arg 

65 ' 
G 1 y 

Al.i 

K 1. s 

O-L '/ 

H 1 s 
1 4 ? 
Fro 



■ z 1 u '* 

."111 , 

■ 213 • 

■ 4 00 * 
Glu Gin 

Gin Leu 

Pro Gly 

35 

Gi l Ala 

5 0 

Cys Arg 
Pr: Pro 

Gii. Ala 
11 

Pro Aia 
13m 

His Gly 
Asr Leu 



2 bo 
PRT 

Artificial Sequence 



nl 

Arg -.-In 

5 

Ala Gin 

20 

His Arq 
Arg Arg 
Gin His 
Aia 



Ala Gly Glu 
19-. 

Leo 7a. Glu 
21 ■ 

?he His His 
225 

Are Ai^ Gle 



Gly 

Hi 

7 0 



nia 
-5 

Gly 



Arg 

I 0 0 

A.,. a Pro 



Leu Pro 

Arg Ala 

Gly Gin 
.65 
Arg Asp 
IbO 

Glu Arg 



Gly 

Leu 
15 0 
His 

His 

Ala 



-&u .'a- 



ASp Arg 
His Arg 



Gly 

2 30 



Asp Phe 

Ala Val 

His Gin 
40 

His Arg 
5 5 

Gly Arg 

Arg His 

Ala Ala 

Pro Arg 
120 
His Val 
135 

Glu Arg 

Asp Leu 

Leu Gin 

Gly Pro 
200 
His His 
215 

Arg Pro 
Pre Are 



Pro 

Glr. 
25 
Gl y 

Are 

Are 



10 = 
Ala 



Gio 
Asr.' 

195 
Aia 



Pro Arg 

10 

Val Gin 
Gin /Ala 
Tyr Val 
Gin Pro 

Aia Ala 

90 

Aia Pro 

Gly Ler 

Leu Leu 

Lee G 1 u 
155 

Gly Val Gly 

/ 0 

Asp .Arg Gly 
Pro Aia Gly 
Leu Ala 



Arg 

Gly 

Pro 

Pro 

6C 

Asp 

Ala 

Ala 

Arg 

Arg 
14 3 
Gin 



Are Pro 

235 
Pro Gly 



Val 
22 0 
Ala 

Leu 



Arg Arg 

Gly Glu 

30 
Arg Gly 
45 

Arg Leu 
Pro Asp 
Glu Gin 
Asp Arg 

11 : 

Aia Leu 
125 

Gin Ala 

Gin Glu 

Ala Arg 

Glu Pro 
190' 
;in Val 

205 

Val His 
His Arg 
Glr. Pro 



His Ala 
15 

Asp Arg 
Ala Ala 
Pro Gly 

Arg Ala 

30 

Pro Ala 
35 

Val Gly 

Pro Pro 

Asp Leu 

Pro Glu 
160 
Asp Gin 
17 5 

Glu Pro 

Gly Lys 

Gin Asp 

Val His

Pro Glu
255
Leu Pro Asp Pro Pro Ala Arg Ala Pro Arg
260 265

<210> 63

<211> 800

<212> DNA

<213> Artificial Sequence

<220>

<222> (1)...(800)

<400> 63

ctc gag cag cgg caa gga gat ttt ccg ccc cgg cgg cgg cga cat gcg 48
Leu Glu Gln Arg Gln Gly Asp Phe Pro Pro Arg Arg Arg Arg His Ala
1 5 10 15

cga caa ctg gcg cag cga gct gta caa gta caa ggt ggt gaa gat cga 96
Arg Gln Leu Ala Gln Arg Ala Val Gln Val Gln Gly Gly Glu Asp Arg
20 25 30

gcc cct ggg cat cgc ccc cac caa ggc caa gcg ccg cgt ggt gca gcg 144
Ala Pro Gly His Arg Pro His Gln Gly Gln Ala Pro Arg Gly Ala Ala
35 40 45

cga gaa gcg cgc cgt ggg cat cgg cgc tat gtt cct cgg ctt cct ggg 192
Arg Glu Ala Arg Arg Gly His Arg Arg Tyr Val Pro Arg Leu Pro Gly
50 55 60

cgc tgc agg cag cac cat ggg cgc cgc cag cct gac cct gac cgt gca 240
Arg Cys Arg Gln His His Gly Arg Arg Gln Pro Asp Pro Asp Arg Ala
65 70 75 80

ggc ccg cca gct gct gag cgg cat cgt gca gca gca gaa caa cct gct 288
Gly Pro Pro Ala Ala Glu Arg His Arg Ala Ala Ala Glu Gln Pro Ala
85 90 95

gcg cgc cat cga ggc cca gca gca cct gct cca gct gac cgt gtg ggg 336
Ala Arg His Arg Gly Pro Ala Ala Pro Ala Pro Ala Asp Arg Val Gly
100 105 110

cat caa gca gct cca ggc ccg cgt gct ggc tct aga gcg cta cct cca 384
His Gln Ala Ala Pro Gly Pro Arg Ala Gly Ser Arg Ala Leu Pro Pro
115 120 125

gga cca gcg ctt cct ggg cat gtg ggg ctg ctc cgg caa gct gat ctg 432
Gly Pro Ala Leu Pro Gly His Val Gly Leu Leu Arg Gln Ala Asp Leu
130 135 140

cac cac ggc cgt gcc ctg gaa cgc cag ctg gag caa caa gaa cct gag 480
His His Gly Arg Ala Leu Glu Arg Gln Leu Glu Gln Gln Glu Pro Glu
145 150 155 160

cca gat ttg gga caa cat gac ctg gat gga gtg gga gcg cga gat cag 528
Pro Asp Leu Gly Gln His Asp Leu Asp Gly Val Gly Ala Arg Asp Gln
165 170 175

caa cta cac cga gat cat cta cag cct gat cga gga gag cca gaa cca 576
Gln Leu His Arg Asp His Leu Gln Pro Asp Arg Gly Glu Pro Glu Pro
180 185 190



gca 
Ala 



Leu 



ttt 
Phe 
T2r5 

cgt 
Arg 



ctt 
Leu 



gga gaa gaa cga get gga cct get cca 
Gly jIu 31u Ara Ala Glv Pro Ala Fro 
135 ' ' 20C 



gga caa gtg ggc 



2C: 



gt 3 gaa ctg gtt caa cat cac ;aa ctg get gtg 



vai Giu Leu 

21 ) 



215 



get gu 7 cat cgt gaa ccg cgt gcg c 
A! » G.i His Arg Glu Pro Arg Ala Pro Giy Met Gin 

2 4 5 2 



aag 
Lys 



a cat 



Val Gin His His 3in Leu Ala Val Val His 



ca*: cit .-at cgt gg j egg c =t gat egg cct gc: cat cgt 
Hi; Asp Arg G.1 / Arg Pro Asp Arg Pro Ala His Arg 

23 3 2 3 5 



ggg at 7 cag ecc 



caa 
Gla 



gtt 
Val 



cct 
Pro 

255 



gat 
Asp 



cac 
His 
24:) 

qag 
Gin 



cc.i gae ccg cct ge : cgt gtg acg gat cc 
Pro Asp Pro Pre Ala Arg Val Thr Aso 
260 " 2 65 



62 4 



672 



7 20 



pn 1 



Leu 
1 

Arg 

Ala 

Arg 

Arg 

65 

Gly 

Ala 

His 

Gly 

His 
14 5 

D ro 



210:> 
21 1> 



<:4 00> 
G 1 u Gin 

Gin Leu 

Pro Gj y 

35 

Glu Ala 
50 

Cy; ; Arg 

Pro Pro 

Arg His 

Gin Ala 
115 
Pro Ala 
130 

His Gly 
Asr 



64 

266 

PRT 

Artificial Sequence 



64 

Arg Gin 

5 

Ala Gin 

2 0 

His Arg 
Arg Arg 
Gin His 

Ala Ala 

3 5 

Arg Gly 
100 

Ala Pro 
Leu Pro 
Arg Ala 



(.-in ueu ms 



Phe 



Giy Glu 
135 
Val Glu 
210 

His His 



Leu Gly Gin 
165 
Arg Asp 

Glu Ara 



Giy Asp 

Arg Ala 

Pro His 

Gly His 
55 

His Gly 

-?o 

Glu Arg 
Pro Ala 



Leu Va 1 
Asp Arg 



Gly His 
135 
Leu Glu 

150 

His Asp 

His Leu 

Ala Gly 

Gin His 
215 
Gly Arg 

230 



Phe Pro Pro 
10 

Val Gin Vai 
25 

31 n Gly G...n 
40 

Arg Arg Tyr 
Arg Arg Gin 

His Arg Ala 

93 1 

Ala Pro Ala 
105 

Arg Ala Gly 
120 

Val Gly Leu 
Arg Gin Leu 
Leu Asp Gly 

ro 

Gin Pro Asp) 
185 

Pro Ala Pro 
200 

His Gin Leu 

Pro Asr Arc 



Arg Ar 7 Arg 

Gin Gly Gly 

Ala Pro Arg 
45 

Val Pro Arg 

60 

Pro Asp Pro 

75 

Ala Al :i Glu 

?rt) Ala Asp 

Ser Aro Ala 
1 2 5 

Leu Arc Gin 
14' 

Glu Glr. Gin 
155 

Val Giy Ala 

Arg Gly Glu 

Ala Giy Gin 

203 

Ala Val Val 

220 
Pro Ala His 

? 1 C : 



Arg His Ala 
15 

Glu Asp Arg 

30 

Gly Ala Ale 

Leu Pro Gly 

Asp Arg Aid 
30 

Gin Pro Ala 
95 

Arg Val Gly 
110 

Leu Pro Pro 

Ala Asp Let: 

Glu Pro 31u 
1 6' 

Arg Asp Glr. 
175 

Pro Glu Pro 
190 

Val Giy Lys 

His Gin Asp 

Arg Val His
240
Arg Ala Glu His Arg Glu Pro Arg Ala Pro Gly Met Gln Pro Pro Glu

245 250 255

Leu Pro Asp Pro Pro Ala Arg Val Thr Asp

260 265



<210> 65

<211> 647

<212> DNA

<213> Artificial Sequence
<220>

<221> CDS

<222> (1)...(647)



<400> 65

etc gag cag egg caa gga gat ttt ccg ccc egg egg egg cga cat gcg 
Leu Glu Gin Arg Gin Gl\ Asp Phe Pre Pro Arg Arg Arg Arg His Ala 



15 



cgc tge agg cag cac cat; ggg cgc cgc cag cct gac cct gac cgt gca 
Arg Cvs Arg Gin His His Gly Arg Arq Gin Pro Asp Pro Asp Arg Ala 
65 ' 70 75 80 

ggc ccg eca get get gag egg cat cgt gca gca gca gaa caa cct get 
G L v Pro Pro Ala Ala Glu Arg His Arq Ala Ala Ala Glu Gin Pro Ala 

3 5 90 9 5 

g-g cgc cat cga ggc eca gca gca cct get eca get gac cgt gtg ggg 
Ala Ara His Arg Giy Pro Ala Ala Pro Ala Pro Ala Asp Arg Val Gly 

ICC 105 HQ 

eat caa gca get eca gg : ccg cgt get ggc tet aga gcg eta cct eca 
His Gin Ala Ala Pro Gly Pro Arg Ala Giy Ser Arg Ala Leu Pro Pro 
115 120 125 

gga eca gcg ctt eet ggg cat gtg ggg etg etc egg caa get gat r:tg 
Giv Pro Ala Leu Pro Gly His Val Gly Leu Leu Ar '5 Gln Ala As P Leu 
130 135 14 0 

cac cac ggc cgt gee etg gaa cgc caz eta gag caa caa gaa ect gag 
His His Giv Arg Ala Leu Glu Ara Gin Leu Glu Gin Gin Giu Pro Glu 
145 ' " 150 155 160 

eca gat ttg gga caa cat gac etg gar gga gtg gga gcg cga gat cag 
Fro Asd Leu Giv Gin His Asc Leu Aso Gly Val Gly Ala Arg Asp Gin 
' ' 17 5 



4? 



96 



144 



cga caa :~ : gcg cag cga get gta caa gta caa ggt ggt gaa gat cga 

Arg Gin L-u Ala Gm Arc Ala Val Gin Val Gin Gly Giy Glu Asp Arg 

20 ' 2- 30 

gee cct ggg cat cgc ccc cac caa ggc caa gcg ccg cgt ggt gca gcg 

Ala Pro Giv His Arg Fro His Gin Giy Gin Ala Pro Arg Gly Ala Ala 

35 4 0 4 5 

cga gaa gcg cgc cgt ggg cat egg cgc tat gtt cct egg ctt cct ggg 192 

Arg Glu Ala Arg Arg Gly His Arg Arg Tyr Val Pro Arg Leu Pro Gly 

50 55 60 



240 



288
caa cta cac cga gat cat cta cag cct gat cga gga gag cca gaa cca 576

Gln Leu His Arg Asp His Leu Gln Pro Asp Arg Gly Glu Pro Glu Pro

180 185 190

gca gga gaa gaa cga gct gga cct gct cca gct gga caa gtg ggc aag 624

Ala Gly Glu Glu Arg Ala Gly Pro Ala Pro Ala Gly Gln Val Gly Lys

195 200 205



ctt gtg tga ctg att gag gat cc 647
Leu Val * Leu Ile Glu Asp



<210> 66
<211> 214
<212> PRT

<213> Artificial Sequence



<400> 66



Leu Glu Gln Arg Gln Gly Asp Phe Pro Pro Arg Arg Arg Arg His Ala
1 5 10 15

Arg Gln Leu Ala Gln Arg Ala Val Gln Val Gln Gly Gly Glu Asp Arg
20 25 30

Ala Pro Gly His Arg Pro His Gln Gly Gln Ala Pro Arg Gly Ala Ala
35 40 45

Arg Glu Ala Arg Arg Gly His Arg Arg Tyr Val Pro Arg Leu Pro Gly
50 55 60

Arg Cys Arg Gln His His Gly Arg Arg Gln Pro Asp Pro Asp Arg Ala
65 70 75 80

Gly Pro Pro Ala Ala Glu Arg His Arg Ala Ala Ala Glu Gln Pro Ala
85 90 95

Ala Arg His Arg Gly Pro Ala Ala Pro Ala Pro Ala Asp Arg Val Gly
100 105 110

His Gln Ala Ala Pro Gly Pro Arg Ala Gly Ser Arg Ala Leu Pro Pro
115 120 125

Gly Pro Ala Leu Pro Gly His Val Gly Leu Leu Arg Gln Ala Asp Leu
130 135 140

His His Gly Arg Ala Leu Glu Arg Gln Leu Glu Gln Gln Glu Pro Glu
145 150 155 160

Pro Asp Leu Gly Gln His Asp Leu Asp Gly Val Gly Ala Arg Asp Gln
165 170 175

Gln Leu His Arg Asp His Leu Gln Pro Asp Arg Gly Glu Pro Glu Pro
180 185 190

Ala Gly Glu Glu Arg Ala Gly Pro Ala Pro Ala Gly Gln Val Gly Lys
195 200 205

Leu Val Leu Ile Glu Asp
210























<210> 67

<211> 1918

<212> DNA

<213> Artificial Sequence



<220>

<221> CDS

<222> (1)...(1918)



<400> 67



\la Ser Ala Ala Asp Arg Lea Trr Val Thr 

5 iC 



/ai Tyr Tyr Giy Vai Fro 
15 



gtg tgg aag gac 
Val Trp L/s Asp 

20 



ace acc acc erg t:c tgc gee age gac gc: aag 
Thr Tnr Thr Leu ?he Cys Ala Ser Asp Ala Lys 



25 



.10 



gec tac gac a :c gag gtg :ac aa; gtg tgg gee ase cac gcg :g: gt g 



Ala Tyr Asp T.ir 

1 z. 



His Asn Val Trr 
4 j 



Tr.r His Ala Cys Va L 
4 5 



144 



etc acc gac c :c aac ccc cag gag gtg gtg ctg ggc aa: gtg a;: gag 132 



Thr Asp Pro Asn 

50 



Gin giu Va-i--Va-l Leu 
55 



Asn Val Tnr Gla 



a a z 
As n 



aac atg 
As.U Met 



aag a 32 aa: atg gtg gag :a g atg :ac gag gat 



L y s A s n Asn Me t V a 1 

70 



Aso 



2 4 0 



d 

He 



ate age etg tgg gac cag ag: ctg aag ccc tgc gtg aag ct j ac : 



:-3r Leu Trp Asp Gin Ser Leu Lys 

3 5 SO 



Val 



Le i Thr 

3 5 



28 3 



ccz ctg tgs gtg a:: ctg aa: tg: acc aag ctg aag aac age ac: gac 33 6 

Pro Leu Cys Val Thr Leu Asn Cys Thr Lys Le j Lys Asn Ser Thr Asp 



100 



105 



1 10 



acc aac aac acc eg: tgg ggc ac: cag gag atg aag aac tgc age ttc 
Thr Asn Asn Thr Arg Trp GLy Thr Gin Glu Met Lys Asn Cys Ser Phe 



115 



120 



125 



384 



aac ate age acc age gtg cgc aac aag atg aag eg: gag tae go: ctg 
Asn He Ser Tnr Ser Val Arg Asn Lys Met Lys Arg Glu Tyr Ala Leu 



13C 



135 



1-1 J 



432 



ttc tae age ctg gac ate gtg ccc ate gac aa : gac aac acc age tac 
Phe Tyr Ser Leu Asp lie Val Pro He Asp Asn Asp Asn Thr Ser Tyr 
14 5 150 155 160 



4 3u 



ccc ctg cgc age 
.Arg Leu Arc Ser 



- -3 - 

' ^ y s 
16 5 



aac aca tc j ate ate ac: cag gee tgc cc : aag 
Asn Tnr Ser He He Thr Gin Ala Cvs Pro Lvs 



17- 



7 7 s 



:tg age ttc gig ccc ate ccc at: cac ttc tg: gc: ccc gee gg : tt : 



Val Ser Phe Glu Pro He Pro He His Phe 



A L a Pro Ala 



Phe 



gee ate ctg aag tgc aac aac aa 5 acc ttc aa: ggc acc ggc c:: tg: 
Ala He Leu Lys Cys Asn Asn Lys Thr Phe Asn GLy Thr Gly Pro Cys 



1 25 



20 ) 



20 5 



acc aac gtg age acc gtg cag tg: acc cac gga act cgc sec gtg gtg 



Asn a 1 oer 
2 i 



Thr Vai Gin Cys 
215 



GLy L le Arg Pro Va i Va i 

22C 



age acc cag ctg 
Co*- Th- 



in Leu Leu Leu Asn 

23C 



etg aac gg: age ctg 
ti ^ 



gc: gag gag gag gtg gt 3 

er Leu Ala Glu Glu Glu Va^ Vai 
235 24 0 
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ate aaa tct gag aac ttc ace aac aac gec aag acc ate ate gtg cag 768 

He Arg Ser 31j Asp. Phe Thr Asn Asn Ala Lys Thr He He Val Gin 

245 25 C 255 

etg aa: gag age gtg gag ate aac tgc see cgc ece aae sac aac ace 316 

Leu Asn Hu Ser Val GLu lie Asn Cys Tr.r Arg Pre Asn Asn Asn Thr 

260 265 27i3 

cgc aag age ate eac ate gge cct ggc ege g:e tte tac acc ace ggc 86-3 

Arg Lys Ser lie .4 is lie Gly Pre Hy Arg A L 3 Phe Tyr Thr Thr Gly 

275 280 285 

tratc ate ate ggc gac ate ege cag get- eac tgc aa: ate tct aga aee 012 

Asp He lie 31 y Asp He Arg Gin Ala His Cys Asn He 3er Arg Thr 

293 295 ' 330 

aae tgg ace aac a ;c etg aag cgc gtg g :c gag aag etg egc gag aag 96 C 

Asn Trp Tr.r Asn Tnr Leu Lys Arg Val Ala GLu Lys Leu Arg Glu L / 3 

30 5 31 D 3H 320 

etc aac aae acc a :c ate gtg ttc aac cag age tc: ggc ggc gac ccc IOC 3 

Phe Asn Asn Thr Tnr He Val Phe Asn Gin Ser Ser Gly Gly Asp Pro 

325 330 ' 335 

gag ate gtg atg ;ac age ttc aac tgc gge gge gag ttc tte tac tgc 1056 

Glu lie Val Met HLs Ser Phe Asn Gys Gly Gly Glu Phe Phe Tyr Cys 

340 345 350 

aae acc acc cag ctg tte aac age ace tgg aae gag acc aac age gag 110 4 

Asn Tnr Thr G.ln Leu Phe Asn Ser Thr Trp Asn Glu Thr Asn Ser Glu 

355 360 * 365 

ggc aac ate act agt gge ace ate acc etg ccc tgc cgc ate aag cag 1152 

Gly Asn lie Thr Ser Gly Thr He Thr Leu Pro Gys Arg lie Lys Gin 

370 375 33) 

itc ate aac atg tgg cag gag gtg ggc aag gee atg tac gec ccc ccc 12 0 0 

Lie He Asn Met Trp Gin Glu Val Gly Lys Ala Met Tyr Ala Pro ?rj 

38 5 390 3 35 4 00 

ate cgc ggc cag ate aag tge ctg age aae ate aee gge ctg ctg ctg 1243 

He Gly Gly Gin He Lys Gys Leu Ser Asn He Thr Gly Leu Leu Leu 

4 05 4 10 4 15 

acc cgc gac ggc ggc age gae aac teg age age ggc aag gag att ttc 1296 

Thr Arg Asp Gly Gly Ser Asp Asn Ser Ser Ser Gly Lys Glu He Pne 

420 425 4 30 

:gc ccc ggc ggc ggc gae atg cgc gac aac tgg ege age gag ctg tac 134 4 

Arg Pro Gly Gly Gly Asp Met Arg Asp Asn Trp Arg Ser Glu Leu T/r 

335 440 445 

aag tec aag gtg gtg aag ate gag ece ctg ggc ate gec ece acc aag 1 - 

Lys T/r Lys Val Val Lys He Glu Pre Leu Gly He Ala Pre Thr Lys 

4d0 455 4 60 

gec aag cgc cgc gtg gtg cag cgc gag aag egc gee gtg ggc ate ggc ; ; 

Ala Lys Arg Arg Val Val Gin Arg Glu Lys Arg Ala Val Gly He G*y 

165 470 4 75 4 80 
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get atg ttc 
Ala Met Phe 



etc ggc 
Leu Gly 
4 35 



ttr ctg ggc get 
Fhe leu Hv Ala A 



gca ggc age ace atg ggc gec 
Ala Gly Ser Tnr Met 31y Ala 

4 95 



14 88 



gec age etg 
Ala Ser leu 



ace ctg ac: gtg cag gc: 
Thr Leu Thr Val Gin Ala 

5C0 5C5 



cge cag ctg etg age gge ate 
Arg Gin leu lea Ser Hy lie 

510 



1536 



gtg eag rag 
Val Gin 2in 

515 



Hn Asn A 



aa: ctg rtg rgc 
Asn Leu Leu Arg 
520 



gee at;: gag grr cag rag cac 
Ala He Glu Ala Gin Gin His 

-> ^ 



1584 



etg etc rag rtg are gt ; tgg ggc ate aag eag ctr 
Leu Leu Gin leu Tnr Val Trp Gly He lys GH Lea 
5 3 j 535 54 3 



cag gee rgc gtg 
Gin Ala Arg Val 



163: 



etg gc: rt i 
Leu Al i le i 
545 



:ag cgc 
j In Ara 



t a : 



rag gao 



Tyr Leu Gin Asp 

5 50 



eag car * - : :t ; ggc it-; tgg 
Gin Arc Phe leu Gly Met Trp 

55 5 560 



168C 



ggc tgc te 
Gly Cys Ser 



jgc aag 
jly lys 
565 



ctg ate tgc acc 
Leu He Cys Thr 



a eg 
Thr 

57 0 



gco gtg eee tgg aa r gee 
Ala Val Pro Trp Asn Ala 
57 5 



1728 



age tgg age aac aag aae ctg age cag att tgg gae 
Ser Trp Ser Asn Lys Asn Leu Ser Gin He Trp Asp 
580 585 



aac atg acc tgg 
Asn Met Thr Trp 

590 



1776 



atg gaa tng gag cgc gag ate age aac tac acc gag 
Met Glu Trp Glu Arg Glu He Ser Asn Tyr Thr Glu 
5 95 600 



ate ate tar age 
He He Tyr Ser 

60 5 



1824 



ctg ate gag gag age cag aac cag cag gag aac aac 
leu lit- GH Glu Ser Gin Asn Gin Gin Glu Lys Asn 
61'"' 615 620 



gag ctg gae ctg 
> : u Leu Asp Leu 



1872 



etc ran ctg 
leu Glr. leu 
625 



gae: aag tgg gca age ttg 
Asp Lys Trp Ala Ser leu 
630 



tgt gae tga ttg agg at r 
Cys Asp * L« Hi Arg He 

635 



1918 



<210> 68

<211> 638
<212> PRT

<213> Artificial Sequence



<400> 68

Ala Ser Ala Ala Asp Arg Leu Trp Val Thr Val Tyr Tyr Gly Val Pro
1 5 10 15

Val Trp Lys Asp Ala Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys
20 25 30

Ala Tyr Asp Thr Glu Val His Asn Val Trp Ala Thr His Ala Cys Val
35 40 45

Pro Thr Asp Pro Asn Pro Gln Glu Val Val Leu Gly Asn Val Thr Glu
50 55 60

Asn Phe Asn Met Gly Lys Asn Asn Met Val Glu Gln Met His Glu Asp
65 70 75 80

Ile Ile Ser Leu Trp Asp Gln Ser Leu Lys Pro Cys Val Lys Leu Thr
85 90 95

Pro Leu Cys Val Thr Leu Asn Cys Thr Lys Leu Lys Asn Ser Thr Asp
100 105 110

Thr Asn Asn Thr Arg Trp Gly Thr Gln Glu Met Lys Asn Cys Ser Phe
115 120 125

Asn Ile Ser Thr Ser Val Arg Asn Lys Met Lys Arg Glu Tyr Ala Leu
130 135 140

Phe Tyr Ser Leu Asp Ile Val Pro Ile Asp Asn Asp Asn Thr Ser Tyr
145 150 155 160

Arg Leu Arg Ser Cys Asn Thr Ser Ile Ile Thr Gln Ala Cys Pro Lys
165 170 175

Val Ser Phe Glu Pro Ile Pro Ile His Phe Cys Ala Pro Ala Gly Phe
180 185 190

Ala Ile Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys
195 200 205

Thr Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val
210 215 220

Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Val Val
225 230 235 240

Ile Arg Ser Glu Asn Phe Thr Asn Asn Ala Lys Thr Ile Ile Val Gln
245 250 255

Leu Asn Glu Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr
260 265 270

Arg Lys Ser Ile His Ile Gly Pro Gly Arg Ala Phe Tyr Thr Thr Gly
275 280 285

Asp Ile Ile Gly Asp Ile Arg Gln Ala His Cys Asn Ile Ser Arg Thr
290 295 300

Asn Trp Thr Asn Thr Leu Lys Arg Val Ala Glu Lys Leu Arg Glu Lys
305 310 315 320

Phe Asn Asn Thr Thr Ile Val Phe Asn Gln Ser Ser Gly Gly Asp Pro
325 330 335

Glu Ile Val Met His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys
340 345 350

Asn Thr Thr Gln Leu Phe Asn Ser Thr Trp Asn Glu Thr Asn Ser Glu
355 360 365

Gly Asn Ile Thr Ser Gly Thr Ile Thr Leu Pro Cys Arg Ile Lys Gln
370 375 380

Ile Ile Asn Met Trp Gln Glu Val Gly Lys Ala Met Tyr Ala Pro Pro
385 390 395 400

Ile Gly Gly Gln Ile Lys Cys Leu Ser Asn Ile Thr Gly Leu Leu Leu
405 410 415

Thr Arg Asp Gly Gly Ser Asp Asn Ser Ser Ser Gly Lys Glu Ile Phe
420 425 430

Arg Pro Gly Gly Gly Asp Met Arg Asp Asn Trp Arg Ser Glu Leu Tyr
435 440 445

Lys Tyr Lys Val Val Lys Ile Glu Pro Leu Gly Ile Ala Pro Thr Lys
450 455 460

Ala Lys Arg Arg Val Val Gln Arg Glu Lys Arg Ala Val Gly Ile Gly
465 470 475 480

Ala Met Phe Leu Gly Phe Leu Gly Ala Ala Gly Ser Thr Met Gly Ala
485 490 495

Ala Ser Leu Thr Leu Thr Val Gln Ala Arg Gln Leu Leu Ser Gly Ile
500 505 510

Val Gln Gln Gln Asn Asn Leu Leu Arg Ala Ile Glu Ala Gln Gln His
515 520 525

Leu Leu Gln Leu Thr Val Trp Gly Ile Lys Gln Leu Gln Ala Arg Val
530 535 540

Leu Ala Leu Glu Arg Tyr Leu Gln Asp Gln Arg Phe Leu Gly Met Trp
545 550 555 560

Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr Ala Val Pro Trp Asn Ala
565 570 575

Ser Trp Ser Asn Lys Asn Leu Ser Gln Ile Trp Asp Asn Met Thr Trp
580 585 590

Met Glu Trp Glu Arg Glu Ile Ser Asn Tyr Thr Glu Ile Ile Tyr Ser
595 600 605

Leu Ile Glu Glu Ser Gln Asn Gln Gln Glu Lys Asn Glu Leu Asp Leu
610 615 620

Leu Gln Leu Asp Lys Trp Ala Ser Leu Cys Asp Leu Arg Ile
625 630 635

<210> 69

<211> 2071

<212> DNA

<213> Artificial Sequence



<221> CDS

<222> (1)...(2071)
<400> 69 

get age geg qcc gac cgc ctg tgg gtg acc gtg ta: tac ggc gtg ccc 4 8 

Ala Ser Ala Ala Asp Arg Leu Trp Val Thr Val Tyr Tyr Gly Val Pro 
I 5 10 15 - 

gtg tgg aag gac gec acc acc acc ctg ttc tgc gec age gac gec aag 96 

Val Trp Lys Asp Ala Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys 
20 25 30 

ges ta: gac acc gag gtg cac aac gtg tgg gec ac; cac gcg tgc gtg 144 

Ala Tyr Asp Thr Glu Val His Asn Val Trp Ala Thr His Ala Cys Val 

3 5 4 0 4 5 

cc: ac: gac ccc aac cc: cag gag gtg gtg ctg gg : aac gtg ac: gag 192 

Pro Thr Asp Pro Asn Pro Gin Glu Val Val Leu Gly Asn Val Thr Glu 

50 55 60 

aac tt: aac atg ggc aag aac aac atg gtg gag cag atg cac gag gat 240 

Asn Phe Asn Met Gly Lys Asn Asn Met Val Glu Gin Met His Glu Asp 

65 70 75 60 

ate at j age ctg tgg gac cag age ctg aag ccc :g: gtg aag ctg acc 286 

He lie Ser Leu Trp Asp Gin Ser Leu Lys Pro Cys Val Lys Leu Thr 

85 90 ^ 95 

ccc ctg t. gc gtg a :c ctg aac tgc acc aag ctg aag aac age acc gac 3 36 

Pro Leu Cys Val Thr Leu Asn Cys Thr Lys Leu Lys Asn Ser Thr Asp 

100 105 * ' 110 

acc aac aac ac: cgc tgg ggc acc cag gag atg aag aac tgc ag: ttc 36 4 

Thr Asn Asn Thr Arg Trp Gly Thr Gin Glu Met Lys Asn Cys Ser Phe 

115 120 125 

aac at: age a:: age gtg cgc aac aag atg aag cgc gag tac gec ctg 4 :2 

Asn He ier Thr Ser Val Arg Asn Lys Met Lvs Arg Glu Tyr Ala Leu 

130 135 ' 140 

ttc tac age ctg gac ate gtg ccc ate gac aac gac aac acc age tac l-~ 

Phe Tyr Ser Leu Asp lie Val Pro lie Asp Asn Asp Asn Thr Ser Tyr 
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15C 155 160 

cgc ctg cgc age tgc aac aca teg ate ate ace eag gee tge cce aag 52 S 

Arg Leu Arg Ser Cys Asn Thr 3er He He Thr Gin Ala Cys Pro Lys 

165 170 17 5 

gtg age ttc gag ccc ate ccc ate eac ttc tgc gec ccc gec gge ttc r"6 

Val Ser Fr.e Slu Fro lie Pro He His ?ne Oys Aia Pro Ala Gly ?he 

ISO 185 * 190 

gee ate etg aag tge aae aac aag ace tte aac ggc ace ggc cce tge 624 

Ala lie Leu Lys Cys Asn Asn Ly3 Thr Phe Asn Gly Thr Gly Pro Cys 

195 200 ' 205 

ace aac gtg age acc gtg cag tge ice cac gga att cgc eec gtg gt g 6 72 

Thr Asn Val 3er Thr Val Gin Cys Thr His Hy lie Arg Pre Val Val 

2 10 215 '?.?.:> 

age aec eag :tg ctg etg aac ggc age etg ice gag gag gag gtg gtg 720 

Ser Ter Gin Leu Leu Leu Asn Gly Ser Leu Ala GIj G^u Slu Val Val 

225 230 235 240 

ate aga cct gag aac ttc acc aac aac gee aag ace ate ate gtg eag 766 

He Arg Ser Glu Asn Phe Thr Asn Asn Ala Lys Thr lie He Val Gin 

245 250 255 

ctg aac gag age gtg gag ate aac tgc acc eg: ccz aac aac aac acc 816 

Leu Asn Glu Ser Val Glu lie Asn Cys Thr Arg Pro Asn Asn Asn Thr 

260 265 270 

cgc aag age ate cac ate ggc cct ggc cgc gec tte tac acc acc ggc 864 

Arg Lye Ser He His He Gly Pro Gly Arg Ala Fhe Tyr Thr Thr Gly 

275 280 " 285 

gae ate ate ggc gac ate cgc cag gec cac :g: aac ate tct aga acc 912 

Asp I U; He Gly Asp He Arg Gin Ala His Cvs Asn He Ser Arg Thr 

2?0 295 ' 300 

aac tgc acc aac acc ctg aag cgc gtg gee gag aag ctg cgc gag aag 960 

Asn T.rp Thr Asn Thr Leu Lys Arg Val Ala Glu Lvs Leu Arg Giu Lvs 

305 310 31 5 ' '' 320 

ttc aac aac acc acc ate gtg ttc aac cag age tec ggc ggc gac ccc 100S 

Phe Asn Asn Thr Thr lie Val Phe Asn Gin Ser Ser Gly Gly Asp Pre 

325 330 J 335 

gag ate gtg atg cac age ttc aac tgc ggc ggc gag ttc ttc tac tgc He- 

Glu He Val Met His Ser Phe Asn Cys Gly Gly Giu Phe ?he Tyr Cys 

340 345 350 

aac ace acc cag ctg tte aac age acc tgg aae gag ace aac age gag H ' \ 

Asn Thr Thr Gin Leu Phe Asn Ser Thr Trp Asn Giu Thr Asn Ser Giu 

355 360 * 365 

ggc aae ate act agt ggc acc ate ace ote; cce tac cgc ate aag eag 

Gly Asr. He Thr Ser Gly Thr He Thr Leu Fro Cys Arc He Lys Gin 

3"C 375 380 

ate ate aae atg tgc cag gag gtg gge aag ccc ate tac gee cce cce 
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lie He Asn Met Trp Gin Glu Val Gly Lys Ala Met Tyr Ala Pro Pro 
385 390 ' 395 400 



ggc ggc caq ate aag tgc ctg age sac ate ace gee ctg ctg ctg 1248 
Gly Gly Gin lie Lys Cys Leu Ser Asn He Thr Gly Leu Leu Leu 



4 10 



4 15 



acc -gc gae ggc ggc age gac aac teg age age ggc aag gag att ttc 
Thr Arg Asp Gly Gly Ser Asp Asn Ser Ser Ser Gly Lys Glu He ?he 
420 425 " ' 430 



12 9 6 



egc cee ggc ggc ggc gac atg egc gac aac tgg cge age gag ctg tac 
TTTg Fid Gly Gly Gly Asp Met Arg Asp* Asn Trp Arg Ser Glu Leu Tyr 
435 440 44 5 



1 3 4 4 



aag tac aag gtg gtg aag ate gag ccc ctg ggc ate gec cce acc aag 
Lys Tyr Lys Val VH Lys lie Glu Pre Leu Gly lie Aia Pro Thr Lvs 
4 50 4 55 " 4 6C 



13 92 



gec aag egc cge gtg gtg cag egc gag aag egc gec gtg ggc ate ggc 
Ala Lys Arq Arg Val Val flirt Arg Glu Lys Arg Ala Val Gly He 31 v 
4 65 4 70 47 c ' 4 B 0 



1440 



get atg ttc etc ggc ttc ctg ggc get gca ggc age acc atg ggc gec 14-68 
Ala Met ?he Leu Gly Phe Leu Gly Ala Ala 31y Ser Thr Met Gly Ala 
4 85 493 4 95 



gec: age ctg ace ctg acc gtg cag gc: cge sag ctg ctg age ggc ate 
Ala Ser Leu Thr Leu Thr Val Gin Ala Arg Gin Leu Leu Ser Gly lie 
500 505 510 



1536 



gtg cag cag cag aac aac ctg ctg eg 2 gee ate gag gec cag cag eae 
Val Gin Sin Gin Asn Asn Leu Leu Arg Ala He Glu Ala Gin Gin His 
515 520 525 



158 4' 



etg etc cag ctg acc gtg tgg ggc ate aag cag etc sag gec egc gtg 

Lee. Leu Gin Leu Thr Val Trp Gly He Lys Gin Leu Gin Aia Arg Val 

530 535 ' 540 

ctg get eta gag cge tac etc cag gae cag cge ttc ctg ggc atg tgg 

Leu Ala Leu Glu Arg Tyr Leu Gin Asp Gin Arg Phe Leu Gly Met Trp 

5 4 = 550 555 " 560 

ggc tgc tec ggc aag ctg ate tgc ace acg gec gtg cce tgg aac gec 

Gly Cys Ser Gly Lys Leu He Cys Thr Thr Ala Val Pro Trp Asn Ala 

5 6 5 5 7 0 5 7 5 

age tgg age aac aag aac ctg age zaq att tgg gae aac atg acc tgg 

Ser Trp Ser Asn Lys Asn Leu Ser Gin He Trp Asp Asn Met Thr Trp 

580 585 * ' 590 

atg gac tgg gag cge gag ate age aac tac acc cag ate ate tac age 

Met Glu Trp Glu Arc Glu He Ser Asn Tyr Thr Glu He He Tyr Ser 

5 95 6C0 ' -0 5 



ctg ate aag gag ace cag aac cag cag gag aag aac gag ctg gac ctg 
Leu lie Glu Glu Ser Gin Asn Gin Gin Glu Lys Asn Glu Leu Asd Leu 
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etc cag ctg gae aag tgg gca age ttg tgg aac egg tzz aac ate acc 
Leu Gin Lej Asp Lys Trp Ala Ser Leu Trp Asn Trp Phe Asn lie Thr 
625 63C 635 640 



1920 



aac t g j 



ct 3 tgg tac ate aag att ttc ate atg ate gtg ggc ggc ctg 
Le j Trp Tyr lie Lys He Phe :ie Met lie Va 1 Giy Gly Leu 
645 650 655 



1963 



it .: gg : 
. ! e -j j_ ,* 



ct j cgc ate gtg ttc acc gtg ctg a jc ate gtg aac cgc gtg 
Lei Arg He Vai Phe Thr Va 1 Leu Ser He Val Asn Arg Val 
660 665 670 



2016 



cgc ca; gg a tgc age ccc ctg age ttc cag acc cgc ctg ccc gtg tga 
Arg Gin Giy Cys Ser Pro Leu Ser Phe Gin Thr Arg Leu Pro Val 
67^ 680 ' 6S5 



2 0 64 



egg at : 
Arc 



207 1 



<210> 70

<211> 689

<212> PRT

<213> Artificial Sequence



<400> 70



Ala 


Ser 


Ala 


Ala 


Asp 


Arg 


Leu 


Trp 


Val 


Thr 


Val 


Tyr 


Tyr 


Giy 


Val 


Pro 


1 








5* 










10 










15 




Val 


Trp 


Lys 


Asp 


Ala 


Thr 


Thr 


Thr 


Leu 


Phe 


Cys 


Ala 


Ser 


Asp 


Ala 


Lys 








20^ 










25 










30 




Ala 


Tyr 


Asp 


Thr 


Giu 


Val 


His 


Asn 


Vai 


Trp 


Aia 


Thr 


His 


Ala 


Cys 


Val 






35 










40 










45 






Pre 


Thr 

50 


Asp 


Pro 


Asn 


Pro 


Gin 
55 


Giu 


Val 


Val 


Leu 


G-y 

60 


Asn 


Vai 


Thr 


Giu 


Asn 


Phe 


As:: 


Met 


Giy 


Lys 


Asn 


Asn 


Met 


Val 


G 1 u 


Gin 


Met 


His 


Giu 


Asp 


65 










70 










-7 r 










so' 


lie 


lie 


Ser 


Leu 


Trp 
35^ 


Asp 


Gin 


Ser 


Leu 


Lys 

90 


Pro 


Zys 


Val 


Lys 


Leu 

95 


Thr 


Pro 


Leu 


Cys 


Val 


Thr 


Leu 


Asn 


Cys 


Thr 


Lys 


Leu 


Lys 


Asn 


Ser 


Thr 


Asp 








100 










105 










110 




Thr 


Asn 


Asn 
11?. 


Thr 


Arg 


Trp 


Gly 


Thr 

120 


Gin 


Giu 


Met 


Lys 


Asn 
125 


Cys 


Ser 


Phe 


Asn 


He 
130 


Ser 


Thr 


Ser 


Val 


Arg 

135 


Asn 


Lys 


Met 


Lys 


Arg 
140 


Giu 


Tyr 


Ala 


Leu 


Phe 


Tyr 


Ser 


Leu 


Asp 


lie 


Val 


Pro 


H e 


Asp 


Asn 


Asp 


Asn 


Thr 


Ser 


Tyr 


145 










150 










1 : 5 










160 


Arg 


Leu 


Arg 


Ser 


Gys 
165 


Asn 


Thr 


Ser 


He 


He 
170 


Thr 


Gin 


Aia 


Cys 


Pro 
175 


Lys 


Val 


Ser 


Phe 


Giu 
180 


Pro 


He 


Pro 


He 


H; s 
1 £• ^ 


Ph e 


Cys 


Ala 


Pro 


Ala 
190 


Gly 


Phe 


Ala 


:n 


Leu 


Lys 


Cys 


Asn 


Asn 


Lys 

200 


Thr 


Phe 


Asn 


Gly 


Thr 

205 


Gly 


Pro 


Cys 


Thr 


Asn 
2 1 0 


Va 1 


Ser 


Thr 


Val 


Gin 
215 


Cys 


Tt" r 


H l s 


G 1 y 


He 
220 


Arg 


Pro 


Vai 


Val 


Ser 


Thr 


G 1 n 


Leu 


Leu 


Leu 


Asn 


Giy 


Ser 


Leu 


Aia 


Giu 


Giu 


Giu 


Val 


Vai 


-> "> ^ 










230 










235 










240 


He 


Arc 






24 5 


Phe 


Thr 


Asn 


Asn 


Ala 

250 


Lys 


Thr 


He 


He 


Vai 
255 


G i. n 
Leu Asn Glu Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr
260 265 270

Arg Lys Ser Ile His Ile Gly Pro Gly Arg Ala Phe Tyr Thr Thr Gly
275 280 285

Asp Ile Ile Gly Asp Ile Arg Gln Ala His Cys Asn Ile Ser Arg Thr
290 295 300

Asn Trp Thr Asn Thr Leu Lys Arg Val Ala Glu Lys Leu Arg Glu Lys
305 310 315 320

Phe Asn Asn Thr Thr Ile Val Phe Asn Gln Ser Ser Gly Gly Asp Pro
325 330 335

Glu Ile Val Met His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys
340 345 350

Asn Thr Thr Gln Leu Phe Asn Ser Thr Trp Asn Glu Thr Asn Ser Glu
355 360 365

Gly Asn Ile Thr Ser Gly Thr Ile Thr Leu Pro Cys Arg Ile Lys Gln
370 375 380

Ile Ile Asn Met Trp Gln Glu Val Gly Lys Ala Met Tyr Ala Pro Pro
385 390 395 400

Ile Gly Gly Gln Ile Lys Cys Leu Ser Asn Ile Thr Gly Leu Leu Leu
405 410 415

Thr Arg Asp Gly Gly Ser Asp Asn Ser Ser Ser Gly Lys Glu Ile Phe
420 425 430

Arg Pro Gly Gly Gly Asp Met Arg Asp Asn Trp Arg Ser Glu Leu Tyr
435 440 445

Lys Tyr Lys Val Val Lys Ile Glu Pro Leu Gly Ile Ala Pro Thr Lys
450 455 460

Ala Lys Arg Arg Val Val Gln Arg Glu Lys Arg Ala Val Gly Ile Gly
465 470 475 480

Ala Met Phe Leu Gly Phe Leu Gly Ala Ala Gly Ser Thr Met Gly Ala
485 490 495

Ala Ser Leu Thr Leu Thr Val Gln Ala Arg Gln Leu Leu Ser Gly Ile
500 505 510

Val Gln Gln Gln Asn Asn Leu Leu Arg Ala Ile Glu Ala Gln Gln His
515 520 525

Leu Leu Gln Leu Thr Val Trp Gly Ile Lys Gln Leu Gln Ala Arg Val
530 535 540

Leu Ala Leu Glu Arg Tyr Leu Gln Asp Gln Arg Phe Leu Gly Met Trp
545 550 555 560

Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr Ala Val Pro Trp Asn Ala
565 570 575

Ser Trp Ser Asn Lys Asn Leu Ser Gln Ile Trp Asp Asn Met Thr Trp
580 585 590

Met Glu Trp Glu Arg Glu Ile Ser Asn Tyr Thr Glu Ile Ile Tyr Ser
595 600 605

Leu Ile Glu Glu Ser Gln Asn Gln Gln Glu Lys Asn Glu Leu Asp Leu
610 615 620

Leu Gln Leu Asp Lys Trp Ala Ser Leu Trp Asn Trp Phe Asn Ile Thr
625 630 635 640

Asn Trp Leu Trp Tyr Ile Lys Ile Phe Ile Met Ile Val Gly Gly Leu
645 650 655

Ile Gly Leu Arg Ile Val Phe Thr Val Leu Ser Ile Val Asn Arg Val
660 665 670

Arg Gln Gly Cys Ser Pro Leu Ser Phe Gln Thr Arg Leu Pro Val Arg
675 680 685
<212> DNA
<213> Artificial Sequence

<220>
<221> CDS
<222> (1)..(2469)

<400> 71

act age gcg gec gac cgc ctg tgg gtg ace gtg :ac tac ggc gtg ccc 4 3 

Ala Ser Al.i Ala Aso Arg Leu Trp Val Tar Val Tvr Tvr Gly Val Pro 

1 5 10 ' ' 15 

cJTcj tgg aag gac gec acc ac: acc ctcr ttc tgc gec age gac gec aag 3^ 

Val Trp Lys Asp Ala Thr Thr Thr Leu Pr.e Cys Ala Ser Asd Ala Lvs 

20 25 3C 

gec tac gac acc gag gtg cac aac gtg tgg gec ace esc gcg tgc gtg 14-1 

Ala Tyr Asp Thr Glu Val His Asn Val Trp Ala Thr His Ala Cys Val 

3 5 4 0 4 5 

ccc acc gac ccc aac ccc cag gag gtg gtg ctg ggc aac gtg acc gag 192 

Pro Thr Asp Pro Asn Pro Gin Glu Val Val Leu Gly Asn Val Thr Giu 

50 55 6D 

aac ttc aac at g ggc aag aac aac atg gtg gag cag atg cac gag gat 240 

Asn Phe Asn Met Gly Lys Asn Asn Met Val Glu Gin Met His Glu Asp 

65 70 75 80 

ate ate age ctg tgg gac cag age ctg aag ccc tg: gtg aag ctg acc 288 

lie lie Ser Leu Trp Asp Gin Ser Leu Lys Pro Cys Val Lys Leu Thr 

85 90 95 

cce ctg tgc gtg acc ctg aac tgc ace aag ctg aag aae age acc gac 336 

Pro Leu Cys Val Thr Leu Asn Cys Thr Lys Leu Lys Asn Ser Thr Asp 

100 105 110 

ace aae aac acc cgc tgg ggc acc tag gag atg aag aar tgc age tte 38 4 

Thr Asn Asn Thr Arg Trp Gly Thr Gin Giu Met Lvs Asn Cys Ser Phe 

115 120 ^ 125 

aac ate age acc age gtg cgc aac aag atg aag cgc gag tac gee ctg 4 32 

Asn lie Ser Thr Ser Val Arg Asn Lvs Met Lvs Arg Gi a Tvr Ala Leu 

130 135 140 

ttc tac age ctg ga: ate gtg ccc ate gac aac gac aae acc age tac 48 0 

Phe Tyr Ser Leu Asp lie Val Pro lie Asp Asn Asp Asn Thr Ser Tyr 

14 5 150 155 160 

cgc ctg cgc age tgc aac aca teg ate ate acc cag gee tgc cc: aag 528 

Are Leu Arg Ser Cys Asn Thr Ser He lie Thr Gin Ala Cys Pro Lys 

165 170 175 

gta age tte gag eee ate ccc ate cac ttc tgc gec ccc gec gge ttc ' " -■■ 

Val Ser Phe Glu Pre lie Pro He His Phe Cys Ala Pro Ala Gl / Phe 

160 165 ' 190 

gee ate ctg aag tgc aac aac aag acc ttc aac gge acc ggc cc: tgc *25 

Ala He Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys 

195 2 00 20 5 
acc aac gtg agc acc gtg cag tgc acc cac gga att cgc ccc gtg gtg 672

Thr Asn Val Ser Thr Val Gin Cys Tnr His Sly He Arg Pre- Vai Val 

21 j 215 220 

age ace eag ct; etc ctg aa: gge age ctg gee gag gag gag gtg gtg HO 

Ser Thr Oln Let: Leu Leu Asa Gly Ser Leu Ala Gin Gla Glu Val Val 

22 5 2 30 2 3;. 24 0 

ate ags ::t gag aac tte ace aac aac gee aag ace ate ate gtg cag ''63 

Jle Arg 3er Glu Asn Phe Tnr Asn Asn Ala Lys Thr lie lie Val Gin 

2-1 5 2 5 0 2 5 5 

ctg aa: gag age gtg gag ate aac tgc 3:c ege ece aac aa : aac ace ti 1 6 

Leu Asn 31 a Ser Val Glu lie Asn Cys Tnr Arg Pro Asn Asn Asn Thr 

2 60 26 3 ' 27 3 

cgc aa 3 age = t e ::a e ate gge ect ggc ege gee tee tae ac: acc gge 8 64 

Arg Lys Ser He .-lis He 3ly Pro Gly Arg Ala Pne Tyr Thr Tnr Gly 

27 5 2 80 23 5 

gac ate acc gge ga e ate egc cag gee eac egc aac ate tct aga acc 912 

Asp He He Gly Aso He Arg Gin Ala Hie Cys Asn lie Ser Arg Thr 

290 295 300 

aac tgg acc aac ace ctg aag cgc gtg gee gag aag ctg cgc gag aag 960 

Asn Trp Thr Asn Thr Leu Lys Arg Val ALa Glu Lys Leu Arg Glu Lys 

:05 310 315 320 

etc aac aac acc ace ate gtg ttc aac cag age tec gge gge gae ccc 1008 

Phe Asn Asn Thr Thr He Val Phe Asn Gin Ser Ser Gly Gly Asp Pro 

325 3 3C 335 

gag ate gtg atg ea e age ttc aac tgc gge gge gag tte ttc tae tgc 1056 

Glu He 7a L Met Hls Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys 

340 345 350 

aac ace acc cag ct g tte aac age acc tgc aac gag acc aac age gag 110 4 

Asn Thr Thr Gin Le j Phe Asn Ser Thr Trp Asn Glu Thr Asn Ser Glu 

355 360 365 

gge aac ate act agt gge ace ate acc ctg ece tgc ege ate aag cag 115 2 

Gly Asn lie Thr Ser Gly Thr He Thr Leu Pre. Cys Arg He Lys Gin 

370 375 380 

ate ate aac atg tgg cag gag gtg gge aag ccc atg tae gee ccc ece 12-"'. 

He He Asn Met Trp Gin Glu Val Gly Lys Ala Met Tyr Ala Pro Pro 

385 390 395 4 00 

ate gge gge cag at e aag egc etg age aac ate acc gge etg ct g ctg : 2 ■', - 

He Gly Gly Gin He Lys Cys Leu Ser Asn He Thr Gly Leu Leu Leu 

4>) 5 4 10 -115 

acc ege gac gge gge age gac aac teg age age gge aag gag ate tte ". . 

Thr Arc Asp Gly G_lv Ser Asp Asn Ser Ser Ser Gly Lys Glu He Phe 

420 425 ' 430 

ege ceo gge gge gee gac atg cgc gac aac tgc cgc age aa 3 ctg tae : : 

Arg Pro Gly Gly Gly Asp Met Arg Asp Asn Trp Arg Ser Glu Leu Tyr 
440



445 



a a g Zi 
Lvs T\ 



c aag gtg 



g t g a a g 
'/a I Lys 



ate 
lie 
455 



gag 
Glu 



ctg ggc 
Leu Gly 



ate 
lie 
4 60 



gec ccc ac: aag 
Ala Pro Thr Lys 



1392 



gec aag ;gc cgc gtg gtg cag cgc gag aag cgc gec gtg ggc at: 
Ala Lys Arg Arg Vai Val Gin Arg Glu Lys Arg Ala Val 31 y lie 
4 6 5 4 7 C 4 7 5 



ggc 
Gly 
430 



14 4 0 



get a t g ttc etc 
Aia Met Phe Leu 



ggc ttc 
Gly Phe 
4 85 



ctg 
Leu 



ggc get 
Gly Ala 



gca ggc 
Ala Hy 
4 90 



age 
Ser 



acc atg ggc 
Thr Met Gly 
4 95 



gec 
Ala 



gec aq: ctg acc :.:tg ace 
Ala Ser Leu Thr Leu Thr 

o n 



gtg 
Val 



cag gee 
Gin Ala 

505 



cgc zag 
Arg Gin 



ctg 
Leu 



ctg aqz ggc 
Leu 3er Glv 



ate 
He 



15: 



g t g c a g ..? a g cag 3 a c a a c ctg 
Val Gin Gin Gin Asn Asn Leu 



ctg cgc 
Leu Arg 
520 



gec acc 
Ala He 



gag 
Glu 



gec cag cag 
Ala Gin Gin 

52 5 



cac 
His 



15S4 



ctg etc cag ctg acc gtg tgg ggc ate aag cag etc 
Leu Leu Gin Leu Thr Val Trp Gly He Lys Gin Leu 
530 535 ^ 540 



cag qc2 cgc 
Gin Ala Arg 



gtg 
Val 



1632 



ctg get eta gag cgc tac etc cag gac cag cgc ttc 
Leu Ala Leu Glu Arg Tyr Leu Gin Asp Gin Arg Phe 
545 550 1 555 



ctg ggc atg tgg 
Leu Gly Met Trp 
56C 



1680 



ggc tgc tec ggc 
Gly Cys Ser Gly 



aag ctg ate tgc acc ac g gee gtg ccc tgg aac gec 
Lys Leu He Cys Thr Thr Ala Val Pro Trp Asn Ala 

5 65 57 3 57 5 



1728 



age tgg age aac 
Ser Trp Ser Asn 

59 J 



aag aac 
Lvs Asn 



ctg 
Leu 



age cag 
Ser Gin 
53 5 



ate tgg .gac aac atg acc tgg 
He Trp Asp Asn Met Thr Trp 
590 



17^6 



atg gag tgg gag 
Met Glu Trp Gia 



cgc gag ate age aac tac acc gag ate ate tac age 
Arg Glu He Ser Asn Tyr Thr Glu He He Tyr Ser 
600 60 5 



182 4 



ctg ate gag gag age cag aa: cag cag gag aag aac 
Leu He Giu Glu Ser Gin Asn Gin Gin Glu Lys Asn 
61C 615 62 0 



gag ctg gac ctg 
Giu Leu Asp Leu 



e^c cag ctg gac 
Leu Gin Leu Asp 
62 5 



aag tgg gca age ttg tgg aac tgg ttc aac ate ace 
Lys Trp Ala Ser Leu Trp Asn Trp Phe Asn lie Thr 
63C 635 640 



1920 



aac tgg eta tgg tac ate 
Asn Tre Leu Tro Tvr lie 



aag 

Lys 



att ttc 
He Phe 



ate atg 
He Met 

650 



ate gtg ggc ggc ctg 
He Val Gly Gly Leu 
655 



1 < - •• 



eg:: eig cue ate gtg ttc acc gtg ctg age ate gtg aac cgc gtg 
Gly Leu Arg He Val Phe Thr Val Leu Ser He Vai Asn Arg Vai 

6 60 66 5 67 0 



j..c cac ggc tac .: : .: ccc ctg age ttc cag acc cgc ctg ccc gtg ccc 
Arg Gln Gly Tyr Ser Pro Leu Ser Phe Gln Thr Arg Leu Pro Val Pro
675 680 685 

cgc ggc ccc gac cgc ccc gag ggc ate gag gag gag ggc cgc gag cgc 2112 
Arg Gly Pro Asp Arg Pro Glu Gly lie He Glu Glu Glv G2v Glu Arg 
690 695 700 

gac cgc gac cgc age acc cgc ctg gtg acc ggc ttc erg ccc ctg ate 216C 

Asp Arg Asp Arg Ser Thr Arg Leu Val Thr Glv Phe Leu Pre Leu T ^ 

705 710 "7 1 5 

tgg gac gac ctg cgc age ctg ttc ctg ttc age t=c cat cga ttc cqo 2203 

s m> Ast Asp Leu Arg Ser Leu Phe Letr Phe Ser Tyr His Arg Leu Arg 

725 730 ' " 735 

gac ctg ctg ctg ate gtg gee cgc ate gtg gag ctg ctg ggc eag cqc 2256 
Asp Leu Leu Leu He Val Ala Arg He Val Glu Leu Leu Glv Arg Arc 
7 4 0 74 5 - ^ .1 

ggc tgg gag ate ctg aag tac tgg tgg aac ctg etc eag tae tgg age 1304 
Gly Trp Glu He Leu Lys Tyr Trp Trp Asn Leu Leu Gin T/r Trp Ser 
755 760 765 

cag gag ctg aag aac tct gca gtg age ctg ctg aac gee acc gec ate .352 
Gin Glu Leu Lys Asn Ser Ala Val Ser Leu Leu Asn Ala Thr Ala lie " ' 

770 775 780 

gee gtg jzc gag ggc acc gac cgc gtg ate gag gtg gtg oag cgc ate 2400 
Ala Val Ala Glu Gly Thr Asp Arg Val He Glu Val Val Hn Arg He 
785 7 90 795 800 

tgg cgc ggc ate ctg cac ate ccc acc cga att cgc cag ggc ttc gag 24 48 

Trp Arg Gly He Leu His He Pro Thr Arg He Arg Gin Gly Phe 
905 810 815 



b J. 



cgc gee ct j ctg taa gga tec 
Arg Ala Lei Leu * Gly Ser 
32 0 



<210> 72
<211> 822
<212> PRT
<213> Artificial Sequence

<400> 72

Ala Ser Ala Ala Asp Arg Leu Trp Val Thr Val Tyr Tyr Gly Val Pro
1 5 10 15

Val Trp Lys Asp Ala Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys
20 25 30

Ala Tyr Asp Thr Glu Val His Asn Val Trp Ala Thr His Ala Cys Val
35 40 45

Pro Thr Asp Pro Asn Pro Gln Glu Val Val Leu Gly Asn Val Thr Glu
50 55 60

Asn Phe Asn Met Gly Lys Asn Asn Met Val Glu Gln Met His Glu Asp
65 70 75 80

Ile Ile Ser Leu Trp Asp Gln Ser Leu Lys Pro Cys Val Lys Leu Thr
85 90 95

Pro Leu Cys Val Thr Leu Asn Cys Thr Lys Leu Lys Asn Ser Thr Asp
100 105 110

Thr Asn Asn Thr Arg Trp Gly Thr Gln Glu Met Lys Asn Cys Ser Phe
115 120 125

Asn Ile Ser Thr Ser Val Arg Asn Lys Met Lys Arg Glu Tyr Ala Leu
130 135 140

Phe Tyr Ser Leu Asp Ile Val Pro Ile Asp Asn Asp Asn Thr Ser Tyr
145 150 155 160

Arg Leu Arg Ser Cys Asn Thr Ser Ile Ile Thr Gln Ala Cys Pro Lys
165 170 175

Val Ser Phe Glu Pro Ile Pro Ile His Phe Cys Ala Pro Ala Gly Phe
180 185 190

Ala Ile Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys
195 200 205

Thr Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val
210 215 220

Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Val Val
225 230 235 240

Ile Arg Ser Glu Asn Phe Thr Asn Asn Ala Lys Thr Ile Ile Val Gln
245 250 255

Leu Asn Glu Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr
260 265 270

Arg Lys Ser Ile His Ile Gly Pro Gly Arg Ala Phe Tyr Thr Thr Gly
275 280 285

Asp Ile Ile Gly Asp Ile Arg Gln Ala His Cys Asn Ile Ser Arg Thr
290 295 300

Asn Trp Thr Asn Thr Leu Lys Arg Val Ala Glu Lys Leu Arg Glu Lys
305 310 315 320

Phe Asn Asn Thr Thr Ile Val Phe Asn Gln Ser Ser Gly Gly Asp Pro
325 330 335

Glu Ile Val Met His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys
340 345 350

Asn Thr Thr Gln Leu Phe Asn Ser Thr Trp Asn Glu Thr Asn Ser Glu
355 360 365

Gly Asn Ile Thr Ser Gly Thr Ile Thr Leu Pro Cys Arg Ile Lys Gln
370 375 380

Ile Ile Asn Met Trp Gln Glu Val Gly Lys Ala Met Tyr Ala Pro Pro
385 390 395 400

Ile Gly Gly Gln Ile Lys Cys Leu Ser Asn Ile Thr Gly Leu Leu Leu
405 410 415

Thr Arg Asp Gly Gly Ser Asp Asn Ser Ser Ser Gly Lys Glu Ile Phe
420 425 430

Arg Pro Gly Gly Gly Asp Met Arg Asp Asn Trp Arg Ser Glu Leu Tyr
435 440 445

Lys Tyr Lys Val Val Lys Ile Glu Pro Leu Gly Ile Ala Pro Thr Lys
450 455 460

Ala Lys Arg Arg Val Val Gln Arg Glu Lys Arg Ala Val Gly Ile Gly
465 470 475 480

Ala Met Phe Leu Gly Phe Leu Gly Ala Ala Gly Ser Thr Met Gly Ala
485 490 495

Ala Ser Leu Thr Leu Thr Val Gln Ala Arg Gln Leu Leu Ser Gly Ile
500 505 510

Val Gln Gln Gln Asn Asn Leu Leu Arg Ala Ile Glu Ala Gln Gln His
515 520 525

Leu Leu Gln Leu Thr Val Trp Gly Ile Lys Gln Leu Gln Ala Arg Val
530 535 540

Leu Ala Leu Glu Arg Tyr Leu Gln Asp Gln Arg Phe Leu Gly Met Trp
545 550 555 560

Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr Ala Val Pro Trp Asn Ala
565 570 575
Ser Trp Ser Asn Lys Asn Leu Ser Gln Ile Trp Asp Asn Met Thr Trp
580 585 590

Met Glu Trp Glu Arg Glu Ile Ser Asn Tyr Thr Glu Ile Ile Tyr Ser
595 600 605

Leu Ile Glu Glu Ser Gln Asn Gln Gln Glu Lys Asn Glu Leu Asp Leu
610 615 620

Leu Gln Leu Asp Lys Trp Ala Ser Leu Trp Asn Trp Phe Asn Ile Thr
625 630 635 640

Asn Trp Leu Trp Tyr Ile Lys Ile Phe Ile Met Ile Val Gly Gly Leu
645 650 655

Ile Gly Leu Arg Ile Val Phe Thr Val Leu Ser Ile Val Asn Arg Val
660 665 670

Arg Gln Gly Tyr Ser Pro Leu Ser Phe Gln Thr Arg Leu Pro Val Pro
675 680 685

Arg Gly Pro Asp Arg Pro Glu Gly Ile Glu Glu Glu Gly Gly Glu Arg
690 695 700

Asp Arg Asp Arg Ser Thr Arg Leu Val Thr Gly Phe Leu Pro Leu Ile
705 710 715 720

Trp Asp Asp Leu Arg Ser Leu Phe Leu Phe Ser Tyr His Arg Leu Arg
725 730 735

Asp Leu Leu Leu Ile Val Ala Arg Ile Val Glu Leu Leu Gly Arg Arg
740 745 750

Gly Trp Glu Ile Leu Lys Tyr Trp Trp Asn Leu Leu Gln Tyr Trp Ser
755 760 765

Gln Glu Leu Lys Asn Ser Ala Val Ser Leu Leu Asn Ala Thr Ala Ile
770 775 780

Ala Val Ala Glu Gly Thr Asp Arg Val Ile Glu Val Val Gln Arg Ile
785 790 795 800

Trp Arg Gly Ile Leu His Ile Pro Thr Arg Ile Arg Gln Gly Phe Glu
805 810 815

Arg Ala Leu Leu Gly Ser
820


<210> 73
<211> 1431
<212> DNA
<213> Artificial Sequence

<220>
<221> CDS
<222> (1)..(1431)

<400> 73

gct agc gcg gcc gac cgc ctg tgg gtg acc gtg tac tac ggc gtg ccc 48
Ala Ser Ala Ala Asp Arg Leu Trp Val Thr Val Tyr Tyr Gly Val Pro
1 5 10 15

gtg tgg aag gac gcc acc acc acc ctg ttc tgc gcc agc gac gcc aag 96
Val Trp Lys Asp Ala Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys
20 25 30

gcc tac gac acc gag gtg cac aac gtg tgg gcc acc cac gcg tgc gtg 144
Ala Tyr Asp Thr Glu Val His Asn Val Trp Ala Thr His Ala Cys Val
35 40 45

ccc acc gac ccc aac ccc cag gag gtg gtg ctg ggc aac gtg acc gag 192
Pro Thr Asp Pro Asn Pro Gln Glu Val Val Leu Gly Asn Val Thr Glu
50 55 60
aac ttc aac atg ggc aag aac aac atg gtg gag cag atg cac gag gat 240
Asn Phe Asn Met Gly Lys Asn Asn Met Val Glu Glr. Met His Glu Asp 
65 70 -75 g 0 

ate ate age ctg tag gac cag age ctg aag cee tgc gtg aag ctg acc 289 
He lie Ser Leu Trp Asp Gin Ser Leu Lys Pro Cys Val Lys Leu Thr 
35 9C 95 

ccc ctg tgc gtg acc ctg aac tgc acc aag ctg aag aac age acc gac 336 
Pre Leu Cys Val Thr Leu Asn Cys Thr Lys Leu Lys Asn Ser Thr Asp 
100 105 110 

aac aac acc C 3 C tgg ggc acc ca-g--gag atg aag aac tgc age ttc 384 
Thr Asn Asn Thr Arg Trp Gly Thr Glr. Glu Met Lys Asn Cys Ser Phe 
115 120 125 

aac ate age ace age gtg cge aac aag atg aag egc gag tac gee ctg 4 32 

Asn lie Ser Thr Ser Val Arg Asn Lys Met Lys Are Glu Tyr Ala Leu 
130 135 140 

ttc tac age ctg gac ate gtg ccc ate gac aac ga z aac acc age tac 4 30 

Phe Tyr Ser Leu Asp lie Val Pro lie Asp Asn Aso Asn Thr Ser Tyr 
1/J 5 150 155 160 

egc ctg egc age tgc aac aca teg ate ate acc cag gee tgc ccc aag 528 
Arg Leu Arg Ser Cys Asn Thr Ser He He Thr Gin Ala Cys Pro Lys 
1-65 170 175 

gtg age tt: gag ccc ate ccc ate cac ttc tgc gee ccc gec ggc ttc 5"'6 
Val Ser Phe Glu Pro lie Pro He His Phe :ys All Pro Ala Gly Phe 
180 185 190 



624 



gee ate ctg aag tgc aac aac aag acc ttc aac gg: acc ggc ccc tgc 

Ala He Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys 
155 200 205 

aec aac gtg age ace gtg cag tgc a:; cae gga att egc ccc gtg gtg 6~2 

Thr Asn Val Ser Thr Val Gin Cys Thr His Gly He Arg Pro Val Val 

210 215 220 

age ac: cag ctg ctg ctg aac ggc age ctg gee gag gag gag gtg gtg 7; 0 

Ser Thr Gin Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Val Val 
225 230 235 240 

at.c aga tct gag aac ttc acc aac aac gec aag ace ate ate gtg cag 7 L .3 

He Arg Ser Glu Asn Phe Thr Asn Asn Ala Lys Thr He He Val Gin 
245 250 255 

ctg aac gag age gtg gag ate aac tgc ace egc ccc aac aac aac acc 316 

Leu Asn Glu Ser Val Glu He Asn Cys Thr Arg ?r'> Asn Asn Asn Thr 
2 60 2 6 5 27 0 

egc aag age ate cac ate ggc cct ggc egc gec ttc tac acc acc ggc 8 ? 4 

Arg Lys Ser He His lie Gly Pro Gly Arg Ala Phe Tyr Thr Thr Gly 
27 5 280 285 

gac ate ate ggc gac ate egc cag gec cac tgc aac: ate tct aga acc 

Asp He He Gly Asp He Arg Gin Ala His Cys Asn lie Ser Arg Thr 

2^0 2 95 300 
aac tgg
Asn Trp 

305 



acc aac acc 
Thr Asn Thr 



ctg aag cgc gtg 
Leu Lys Arg Val 

310 



gcc gag aag ctg 
Ala Glu Lys Leu 
313 



eg- gag aag 
Arj Glu Lys 
320 



960 



ttc aa; 
Phe Asn 



aac acc acc 
Asn Thr Thr 
325 



ate gtg ttc aac 
He Val Phe Asn 



cag age tec gge 
Gin 3er 3er Gly 

330 



qqz gaz ccc 
Gly Asp Pro 
33 5 



1008 



gag at : 

Glu lie 



gtg atg cac 
Val Met His 
340 



age ttc 
Ser Phe 



aac tgc 
Asn Cys 
345 



ggc qgz ;ag ttc 
Gly Gly Hu Phe 



tae tgc 
Phe Tyr Cys 
350 



1 356 



aac ac: 
Asn Thr 



acc rag ctg 
Tnr Sin Leu 
355 



ttc aac 
Phe Asn 



age acc 
Ser Thr 
360 



tgg na ; 
Tro Asn 



5ag acc 
Glu Thr 
365 



aac age gag 
Asn Ser Glu 



1 10< 



ggc aa : 
Gly Asn 
37 ) 



ate ict agt 
He Thr Ser 



ggc acc 
Gly Thr 
375 



ate acc 
He Thr 



ctg cc: 
Leu Pr > 



tgc cgc 
Cys Arg 
380 



ate aag cag 
He Lys Gin 



ate at 7 
lie in 
385 



aa: i*:g tgg 
Asn Met Trp 



cag gag 
Gin Glu 
390 



gtg ggc 
Val Gly 



aag gc ; 
Lys Ala 
395 



a t g t a c 
Met Tyr 



gcc ccc ccc 
Ala Pro Pro 
400 



1200 



ate ggc 
He Giv 



ggc cag ate 
Gly Gin He 
405 



aag tgc 
Lys Cys 



ctg age 
Leu Ser 



aac atc 
Asn He 
410 



acc ggc 
Thr Gly 



ctg ctg ctg 
Leu Leu Leu 
415 



1248 



acc eg- gac ggc ggc age gac aac teg age age ggc aag gag att ttc 
Thr Arj Asp Gly Gly Ser Asp Asn Ser Ser Ser Gly Lys Glu He Phe 
420 425 430 



1296 



cgc cc.; 
Arg Pr- > 



ggc 

31y Gly Gly 
4 35 



gac atg cgc gac 
Asp Met Arg Asp 
440 



aac tgg cgc age 
Asn Trp Arg Ser 
445 



gag ctg tac 
Glu Leu Tyr 



134 4 



aag ca - 
Lys Tyr 

4 5 :■ 



aa7 gtg gtg 
Lys Val Val 



aag ate gag ccc 
Lys He Glu Pro 
455 



ctg ggc ate gcc 
Leu Gly lie Ala 
4 60 



ccc acc aag 
Pro Thr Lys 



1392 



gcc aai 
Ala Lyr. 
465 



cgc cgc gtg 
Arg Arg Val 



gtg cag cgc gag 
Val Gin Arg Glu 
470 



aag cgc gcc tag 
Lys Arg Ala * 
475 



14 31 



<210> 74
<211> 476
<212> PRT
<213> Artificial Sequence

<400> 74

Ala Ser Ala Ala Asp Arg Leu Trp Val Thr Val Tyr Tyr Gly Val Pro
1 5 10 15

Val Trp Lys Asp Ala Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys
20 25 30

Ala Tyr Asp Thr Glu Val His Asn Val Trp Ala Thr His Ala Cys Val
35 40 45

Pro Thr Asp Pro Asn Pro Gln Glu Val Val Leu Gly Asn Val Thr Glu
50 55 60

Asn Phe Asn Met Gly Lys Asn Asn Met Val Glu Gln Met His Glu Asp
65 70 75 80

Ile Ile Ser Leu Trp Asp Gln Ser Leu Lys Pro Cys Val Lys Leu Thr
85 90 95

Pro Leu Cys Val Thr Leu Asn Cys Thr Lys Leu Lys Asn Ser Thr Asp
100 105 110

Thr Asn Asn Thr Arg Trp Gly Thr Gln Glu Met Lys Asn Cys Ser Phe
115 120 125

Asn Ile Ser Thr Ser Val Arg Asn Lys Met Lys Arg Glu Tyr Ala Leu
130 135 140

Phe Tyr Ser Leu Asp Ile Val Pro Ile Asp Asn Asp Asn Thr Ser Tyr
145 150 155 160

Arg Leu Arg Ser Cys Asn Thr Ser Ile Ile Thr Gln Ala Cys Pro Lys
165 170 175

Val Ser Phe Glu Pro Ile Pro Ile His Phe Cys Ala Pro Ala Gly Phe
180 185 190

Ala Ile Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys
195 200 205

Thr Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val
210 215 220

Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Val Val
225 230 235 240

Ile Arg Ser Glu Asn Phe Thr Asn Asn Ala Lys Thr Ile Ile Val Gln
245 250 255

Leu Asn Glu Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr
260 265 270

Arg Lys Ser Ile His Ile Gly Pro Gly Arg Ala Phe Tyr Thr Thr Gly
275 280 285

Asp Ile Ile Gly Asp Ile Arg Gln Ala His Cys Asn Ile Ser Arg Thr
290 295 300

Asn Trp Thr Asn Thr Leu Lys Arg Val Ala Glu Lys Leu Arg Glu Lys
305 310 315 320

Phe Asn Asn Thr Thr Ile Val Phe Asn Gln Ser Ser Gly Gly Asp Pro
325 330 335

Glu Ile Val Met His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys
340 345 350

Asn Thr Thr Gln Leu Phe Asn Ser Thr Trp Asn Glu Thr Asn Ser Glu
355 360 365

Gly Asn Ile Thr Ser Gly Thr Ile Thr Leu Pro Cys Arg Ile Lys Gln
370 375 380

Ile Ile Asn Met Trp Gln Glu Val Gly Lys Ala Met Tyr Ala Pro Pro
385 390 395 400

Ile Gly Gly Gln Ile Lys Cys Leu Ser Asn Ile Thr Gly Leu Leu Leu
405 410 415

Thr Arg Asp Gly Gly Ser Asp Asn Ser Ser Ser Gly Lys Glu Ile Phe
420 425 430

Arg Pro Gly Gly Gly Asp Met Arg Asp Asn Trp Arg Ser Glu Leu Tyr
435 440 445

Lys Tyr Lys Val Val Lys Ile Glu Pro Leu Gly Ile Ala Pro Thr Lys
450 455 460

Ala Lys Arg Arg Val Val Gln Arg Glu Lys Arg Ala
465 470 475

<210> 75
<211> 1038
<212> DNA
<213> Artificial Sequence

<220>
<221> CDS
<222> (1)..(1038)

<400> 75

gcc gtg cgc ate ggc get atg ttc etc ggc ttc ctg ggc get gca ggc 48 

Ala VaL Gly lie Gly Ala Met Fhe Leu Gly Phe Leu Gly Ala Ala Gly 

1 5 10 15 

age aec at q ggc gcc gcc age ctg a zc ctg acc gtg cag gee cgc cag 96 
Ser Thr Ket Gly Ala Ala Ser Leu Thr Leu Thr Val Gin Ala Arg Gin 
20 25 30 

ctg ctg age ggc ate gtg cag cag cag aac aac ctg etg cgc gcc ate 14 4 

Leu Leu Ser Gly lie Val Gin Gin Gin Asn Asn Leu Leu Arg Ala lie 

4 0 4 5 

gag gcc cag cag eac ctg etc cag ctg acc gtg tgg ggc ate aag cag 192 
Glu Ala Gin Gin His Leu Leu Gin Leu Thr Val Trp Gly He Lys Gin 
50 5 5 60 

etc cag gcc cgc gtg ctg get eta gag cgc tae cte cag gac cag cgc 240 
Leu Gin Ala Arg Val Leu Ala Leu Glu Arg Tyr Leu Gin Asd Gin Arg 
65 70 75 80 

ttc ctg ggc atg tgg ggc tgc tec ggc aag ctg ate tgc acc acg gcc 288 
Phe Leu Gly Met Trp Gly Cys Ser Gly Lys Leu He Cys Thr Thr Ala 
85 90 95 

gtg ccc tgg aac gcc age tgg age aac aag aac ctg age cag att tgg 336 
Val Pro Trp Asn Ala Ser Trp Ser Asn Lys Asn Leu Ser Gin He Trp 
100 105 HO 

gac aac atg acc tgg atg gag tgg gag cgc gag ate age aac tac acc 384 
Asp Asn Met Thr Trp Met Glu Trp Glu Arg 31u He Ser Asn Tyr Thr 
115 120 125 

gag ate ate tac age ctg ate gag gag age cag aac cag cag gag aag 432 
Glu He He Tyr Ser Leu He Glu Glu Ser Gin Asn Gin Gin Glu Lys 
130 i35 140 

aac gag ctg gac ctg etc cag ctg gac aag tgg gca age ttg tgg aac 4 60 

Asn Glu Lea Asp Leu Leu Gin Leu Asp Lys Trie Ala Ser Leu TrD Asn 
145 150 155 ' 160 



tgg ttc aic ate acc aac tgg ctg tgg tac ate aag att ttc ate at a 

Trp Phe Asn lie Thr Asn Trp Leu Trp Tyr He Lys lie Phe He Met 
165 170 175 

ate gtg gge ggc ctg ate ggc ctg cgc ate gtg ttc acc gtg ctg age 

lie Vjl Gly Gly Leu He Gly Leu Arg He Val. Phe Thr Val Leu Ser 

180 185 190 

ate gtg aac cgc gtg cgc cag ggc tac age ccc ctg 3gc ttc cag acc 

He Val Asn Arg Val Arg Gin Gly Tyr Ser Pro Leu Ser Phe Gin Thr 

1-^5 200 205 

cgc etg ccc gtg ccc cgc ggc ccc gac cgc ccc gag ggc ate gag gag 

Arg Leu Pro Val Pro Arg Gly Pro Asp Arg Pro Glu Gly He Glu Glu 
215



220 



gag ggc gg; 
Giu Sly Gl; 
225 



gaa cgc gac cgc gac cge ag=: 
Glu Arg Asp Arg Asp Arg Se: 



acc cgc ctg gtg acc ggc 
Thr Arg Leu 7a 1 Thr Giy 
235 24C 



7 20 



g cc: ctg ate tgg gac gac ctg cgc 
Pne lee Pr; Leu lie Trp Asp Asp Leu Arg 
245 250 



age ctg tt: ctg 
Ser Leu Phe Leu 



ttc age 
?he Ser 
255 



763 



t ac tat eg a 
Tyr His Ar j 



ttg cgc gac ctg ctg 
Leu Ara Asd Leu Leu 



ctg ate gtg gee eg: ate gtg gag 
Leu lie Val Ala Ar 3 lie 7al Glu 
26c 270 



816 



ctg ztj eg: egg tec ggt tgg gag ate ctg 
Leu Leu Gl / Arg Arg Giy Trp Giu lie Leu 
275 " ' 230 



aag tac to 7 tgg aac ctg 
Lys Tyr Trp Trp Asn Leu 
28 5 



S64 



etc cag tt : * ) ; age ta g gag ctg aag aae 
Leu Gin Tyr Trp Ser Gin Giu Leu Lys Asn 
290 295 



tet gca gt g : : ge ctg ctg 
Ser Ala Val Ser Leu Leu 

300 



9i; 



aac gec ace 
Asn Ala Thr 
305 



gee ate gee gtg gee 
Ala lie Ala Val Aia 
.310 



gag ggc acc gac eg: gtg ate gag 
Gli Giy Thr Asp Arg Vai lie Glu 
315 320 



960 



gtg gtg cag cgc ate tgg cgc ggc at: ctg 
Vai Vai Gin Arg He Trp Arg Giy He Leu 
325 330 



cac ate ccc acc 
His He Pro Thr 



ega att 
Arg lie 
335 



1008 



cgc cag ggc 
Arg Gin Giy 



Phe 
5 4 0 



gag eg; gec ctg 
Glu Arg Ala Leu 



ctg taa 
Leu * 

345 



1038 



<210> 76
<211> 345
<212> PRT
<213> Artificial Sequence

<400> 76

Ala Val Gly Ile Gly Ala Met Phe Leu Gly Phe Leu Gly Ala Ala Gly
1 5 10 15

Ser Thr Met Gly Ala Ala Ser Leu Thr Leu Thr Val Gln Ala Arg Gln
20 25 30

Leu Leu Ser Gly Ile Val Gln Gln Gln Asn Asn Leu Leu Arg Ala Ile
35 40 45

Glu Ala Gln Gln His Leu Leu Gln Leu Thr Val Trp Gly Ile Lys Gln
50 55 60

Leu Gln Ala Arg Val Leu Ala Leu Glu Arg Tyr Leu Gln Asp Gln Arg
65 70 75 80

Phe Leu Gly Met Trp Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr Ala
85 90 95

Val Pro Trp Asn Ala Ser Trp Ser Asn Lys Asn Leu Ser Gln Ile Trp
100 105 110

Asp Asn Met Thr Trp Met Glu Trp Glu Arg Glu Ile Ser Asn Tyr Thr
115 120 125

Glu Ile Ile Tyr Ser Leu Ile Glu Glu Ser Gln Asn Gln Gln Glu Lys
130 135 140

WO 00/29561 PCT/DK00/00144 


Asn Glu Leu Asp Leu Leu Gln Leu Asp Lys Trp Ala Ser Leu Trp Asn
145 150 155 160

Trp Phe Asn Ile Thr Asn Trp Leu Trp Tyr Ile Lys Ile Phe Ile Met
165 170 175

Ile Val Gly Gly Leu Ile Gly Leu Arg Ile Val Phe Thr Val Leu Ser
180 185 190

Ile Val Asn Arg Val Arg Gln Gly Tyr Ser Pro Leu Ser Phe Gln Thr
195 200 205

Arg Leu Pro Val Pro Arg Gly Pro Asp Arg Pro Glu Gly Ile Glu Glu
210 215 220

Glu Gly Gly Glu Arg Asp Arg Asp Arg Ser Thr Arg Leu Val Thr Gly
225 230 235 240

Phe Leu Pro Leu Ile Trp Asp Asp Leu Arg Ser Leu Phe Leu Phe Ser
245 250 255

Tyr His Arg Leu Arg Asp Leu Leu Leu Ile Val Ala Arg Ile Val Glu
260 265 270

Leu Leu Gly Arg Arg Gly Trp Glu Ile Leu Lys Tyr Trp Trp Asn Leu
275 280 285

Leu Gln Tyr Trp Ser Gln Glu Leu Lys Asn Ser Ala Val Ser Leu Leu
290 295 300

Asn Ala Thr Ala Ile Ala Val Ala Glu Gly Thr Asp Arg Val Ile Glu
305 310 315 320

Val Val Gln Arg Ile Trp Arg Gly Ile Leu His Ile Pro Thr Arg Ile
325 330 335

Arg Gln Gly Phe Glu Arg Ala Leu Leu
340 345



